Methodology

Measurements were conducted in a manner that is statistically valid, repeatable and technically consistent, and that provides absolute comparability of KPIs across all networks and technologies. All KPIs measured in the tests are defined in ETSI standards and ITU recommendations.

The scope of this year’s project is the largest of all those carried out by RATEL in the five-year history of benchmarking Serbian mobile operators. It included drive tests, hotspot locations and railways to an extent never seen before in the Serbian market.

The map of covered routes is presented in Figure 1.

The categorisation applied precisely follows the ETSI 103 559 Annex A document.

Drive Test

The campaign covered 55 cities and 15,000 km of Serbian roads and was divided into area type categories: Big Cities, Medium Cities, Small Cities, Highways, Main Roads and Rural Roads. The categorisation precisely followed the ETSI 103 559 Annex A recommendations. The number of test samples collected during the drive tests (in each network, over 8,000 voice calls and around 8,000 tests of every data service) is sufficient to achieve an accuracy better than 3% (standard error) at a confidence level better than 95%.

 
Big Cities:
  • Beograd
  • Kragujevac
  • Niš
  • Novi Sad
  • Subotica
Medium Cities:
  • Bor
  • Čačak
  • Jagodina
  • Kikinda
  • Kraljevo
  • Kruševac
  • Leskovac
  • Loznica
  • Novi Pazar
  • Pančevo
  • Pirot
  • Požarevac
  • Prokuplje
  • Šabac
  • Smederevo
  • Sombor
  • Sremska Mitrovica
  • Užice
  • Valjevo
  • Vranje
  • Vršac
  • Zaječar
  • Zrenjanin
Small Cities:
  • Aleksinac
  • Apatin
  • Aranđelovac
  • Bačka Palanka
  • Bačka Topola
  • Bečej
  • Bujanovac
  • Gornji Milanovac
  • Inđija
  • Kula
  • Lazarevac
  • Negotin
  • Obrenovac
  • Paraćin
  • Preševo
  • Priboj
  • Prijepolje
  • Ruma
  • Sjenica
  • Smederevska Palanka
  • Stara Pazova
  • Temerin
  • Vrnjačka Banja
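
As a rough plausibility check of the sample sizes quoted for the drive test (about 8,000 tests per service in each network), the sketch below computes the margin of error of a measured success ratio. It assumes the textbook normal approximation for a proportion with the worst case p = 0.5; this is an illustrative assumption, not necessarily the statistical procedure used in the campaign, and the function name is hypothetical.

```python
import math


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Half-width of the normal-approximation confidence interval for a ratio.

    n: number of test samples
    p: assumed true ratio (0.5 is the worst case, giving the widest interval)
    z: normal quantile (1.96 corresponds to a 95% confidence level)
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return z * math.sqrt(p * (1.0 - p) / n)


# About 8,000 samples per service and network, as quoted for the drive test.
moe = margin_of_error(8000)
print(f"margin of error at 95% confidence: {moe:.2%}")
```

Under these assumptions the margin of error is roughly 1.1%, which is consistent with the stated accuracy of better than 3%. Per-area results rely on fewer samples and therefore have wider intervals.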

Measurements were performed in drive test mode, which means that the measurement equipment was installed in moving vehicles. The measurement equipment collects network data by running voice and data tests and by using a scanner to obtain radio network parameters. All three mobile networks were measured at the same time and on the same drive test routes using the same smartphone models: Samsung Galaxy S10 for voice and WhatsApp messaging, and Sony Xperia 1 II for data. The Samsung Galaxy S10 was equipped with the operators’ firmware to support all the latest network features (including VoLTE – Voice over LTE calls). The Sony Xperia 1 II is a Cat. 19 mobile device, which means that it supports 4G (LTE) data speeds of up to 1600 Mbps for receiving data and 300 Mbps for sending data. All smartphones worked in automatic technology selection mode. To reflect the latest technical developments in the mobile networks and to examine the benefits of the available capabilities, SIM cards with the most comprehensive mobile tariff plans available from each operator (the plans with the highest data rates, the highest number of minutes and the largest data volume) were used.

Mobile tariff plans used for testing the networks are shown in Table 1:

Operator | Tariff plan used for voice tests | Tariff plan used for data tests
Telekom Srbija | SOKO | SOKO
Telenor | Total Data+ | Total Data+
A1 | NeoNeoMax | NeoNeoMax

Table 1. Mobile Tariff plans – Drive test

The measurement system consisted of two test cars equipped with identical measurement equipment (SwissQual Diversity Benchmarker II) capable of measuring all network technologies and services simultaneously with very high accuracy. To perform voice tests, the Samsung Galaxy S10 smartphones continuously called each other within the same mobile network. In Big Cities, voice tests were executed in a mobile-to-mobile scenario between the two cars; in Medium and Small Cities and on the Roads, mobile-to-mobile calls were performed within the same car. This setup in Big Cities allowed effective data collection without generating too much voice traffic within a single radio cell in areas where higher mobile traffic is expected. Voice tests assess network accessibility, retainability and quality of speech. Voice calls with a duration of 85 seconds were measured during benchmarking.

Test equipment installed in cars is presented in Figure 2.

Fig. 2. Test Equipment installed in cars

Receiving or sending of additional data during the voice test call was added to the measurement scenario in order to simulate the behaviour of a regular subscriber using a smartphone, for which background data transmissions are typical during a voice call. For each of the voice calls, the quality of speech samples was measured (MOS – Mean Opinion Score) using the standard POLQA (ITU-T P.863) algorithm.

Data tests were performed using Sony Xperia 1 II smartphones and a dedicated measurement server located at the Serbian Open eXchange (SoX) in Belgrade, which ensured a fair transmission path to all three mobile networks. Data tests assess the network availability, stability, typical performance and highest capabilities. The most representative data services measured during benchmarking were:

  • Small file Transfer - Download (throughput of a 10 MB file transmission over HTTP protocol from the measurement server to the smartphone). The small file transfer Download test is designed to measure the responsiveness of the network and simulate a user downloading small files such as pictures, mp3 files or email attachments.
  • Small file Transfer - Upload (throughput of a 5 MB file transmission over HTTP protocol from the smartphone to the measurement server). The small file transfer Upload test is designed to measure the responsiveness of the network and simulate a user uploading small files such as pictures, mp3 files or email attachments.
  • Web Browsing Static (testing how fast the reference ETSI Kepler web page is received and opened on smartphones).
  • Web browsing Live Page (testing how fast the real web pages are received and opened on smartphones).
  • YouTube (testing the quality of live stream video transmission).
  • Ping (measuring the delay introduced by the network between sending a packet and receiving the reply).
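
For the file transfer tests in the list above, throughput is the amount of data transferred divided by the transfer time. The sketch below illustrates this arithmetic only; it assumes 1 MB = 1,000,000 bytes and that the timing covers just the transfer itself, which may differ from the exact definition applied by the measurement tools. The function name and the 2-second duration are hypothetical.

```python
def throughput_mbps(file_size_bytes: int, transfer_time_s: float) -> float:
    """Mean throughput in Mbit/s for a completed file transfer."""
    if transfer_time_s <= 0:
        raise ValueError("transfer time must be positive")
    bits = file_size_bytes * 8
    return bits / transfer_time_s / 1_000_000


# Hypothetical example: the 10 MB download file received in 2 seconds.
download_mbps = throughput_mbps(10 * 1_000_000, 2.0)
print(download_mbps)  # 40.0
```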

To simulate the behaviour of an average mobile subscriber in Serbia surfing the Internet, a set of websites was tested, selected on the basis of their popularity among Serbian users:

https://www.twitter.com
https://www.yahoo.com
https://www.kupindo.com
https://halooglasi.com/

In the YouTube test, the quality of a live stream was measured (VMOS - Video Mean Opinion Score) using the standard J.343.1 algorithm.

This year, for the first time, testing of WhatsApp messaging was added to the setup. A pair of Samsung Galaxy S10 terminals was included in the measurement systems, with sending and receiving handled by separate terminals. The aim of this setup is to capture the end-user experience of the WhatsApp messaging service over mobile networks.

A scanner (Rohde & Schwarz TSME) was used to test radio parameters of the mobile networks. SwissQual NQDI software was used for network data post-processing and reporting. The post-processing activity was supported by the use of the Systemics-PAB proprietary Data Warehouse for customized analysis.

 

Walk test

The campaign was divided into two parts as per the ETSI 103 559 Annex A document. Hotspot measurements covered locations in Belgrade, Novi Sad and Niš. Railway corridor measurements included 1,800 km of main Serbian railway routes: Corridor X and Corridor XI.

Measurements in walk test locations were performed in nomadic mode, which means that the measurement equipment was installed in a backpack carried by a measurement engineer.

The measurement equipment collects network data by running voice and data tests and by using a scanner to obtain radio network parameters. All three mobile networks were measured at the same time and on the same walk test routes using the same smartphone models: Samsung Galaxy S10 for voice and WhatsApp messaging, and Sony Xperia 1 II for data services. The Samsung Galaxy S10 was equipped with the operators’ firmware to support all the latest network features (including VoLTE – Voice over LTE calls). The Sony Xperia 1 II is a Cat. 19 mobile device, which means that it supports 4G (LTE) data speeds of up to 1600 Mbps for receiving data and 300 Mbps for sending data. All smartphones worked in automatic technology selection mode. To reflect the latest technical developments in the mobile networks and to examine the benefits of the available capabilities, SIM cards with the most comprehensive mobile tariff plans available from each operator (the plans with the highest data rates, the highest number of minutes and the largest data volume) were used.

Mobile tariff plans used for testing the networks are shown in Table 2:

Operator | Tariff plan used for voice tests | Tariff plan used for data tests
Telekom Srbija | SOKO | SOKO
Telenor | Total Data+ | Total Data+
A1 | NeoNeoMax | NeoNeoMax

Table 2. Mobile Tariff plans – HotSpots

The measurement system consisted of a backpack (SwissQual Freerider III) equipped with identical measurement terminals capable of measuring all network technologies with very high accuracy. To perform voice tests, the Samsung Galaxy S10 smartphones continuously called each other within the same mobile network. Voice tests were executed in a mobile-to-mobile scenario between terminals in the same backpack. The Sony Xperia 1 II was used to perform data tests. The same measurement scenarios as in the drive test were used for hotspot measurements.

A scanner (Rohde & Schwarz TSME) was used to test radio parameters of the mobile networks. SwissQual NQDI software was used for network data post-processing and reporting. The post-processing activity was supported by the use of the Systemics-PAB proprietary Data Warehouse for customized analysis.

Test equipment used in Walk Test measurements is presented in Figure 3.

Fig. 3. Test Equipment used in Walk Test measurements

Walk test measurement results are a part of ETSI 103 559 Annex A scoring methodology and are considered for final score calculation.

 

Hot Spot

Hotspot measurements included outdoor and indoor locations. The locations covered were chosen based on their popularity and public importance.

Belgrade
  • Ada Ciganlija Lake
  • Airport Nikola Tesla
  • Belgrade Waterfront
  • Beo Shopping Mall
  • Galerija Shopping Mall
  • Knez Mihajlova & Kalemegdan
  • Ušće Shopping Mall
Novi Sad
  • Novi Sad pedestrian zone
  • Promenada Novi Sad
Niš
  • Niš pedestrian zone

Hotspot measurement results are a part of the ETSI 103 559 Annex A scoring methodology and are considered in the final score calculation.

Scoring Methodology

The scoring methodology was developed for assessing user perception of voice, data and video services after the benchmarking measurements, with the purpose of assessing the ranking of the measured mobile network operators.

The main challenge is to select Key Performance Indicators (KPIs) which represent true user experience, and to develop the algorithm which allows the calculation of the unified quality metric for every operator – the score. In the ETSI 103 559 Annex A approach, the score reflects user experience and is focused on issues which have high influence on users’ dissatisfaction.

For every service tested, at least two metrics are identified that fit this scoring concept.

The usability of the service is assessed using the Call or Session Setup Success Rate (CSSR for voice tests or SSSR for data tests). CSSR/SSSR is an indicator tested for all types of services and is defined as the ratio of successful test setups to all test attempts. The exact method of calculation is given by the formula below:

 
\[
\mathrm{CSSR/SSSR} = \frac{\text{number of successful setup attempts for the service}}{\text{number of all test attempts for the service}}
\]
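
The ratio above can be sketched in a few lines. The function name and the sample counts are hypothetical and serve only to illustrate the calculation.

```python
def setup_success_rate(successful_setups: int, attempts: int) -> float:
    """CSSR (voice) or SSSR (data): successful setups divided by all attempts."""
    if attempts <= 0:
        raise ValueError("at least one test attempt is required")
    if not 0 <= successful_setups <= attempts:
        raise ValueError("successful setups must lie between 0 and attempts")
    return successful_setups / attempts


# Hypothetical counts: 7,960 successful call setups out of 8,000 attempts.
cssr = setup_success_rate(7960, 8000)
print(f"CSSR = {cssr:.2%}")  # CSSR = 99.50%
```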
 

The second metric is the quality of the given service itself. This metric will depend on the type of test executed. For voice calls it is assessed as speech quality, while for data tests it is assessed as data throughput, session time or video stream quality in case of YouTube tests. For services where it is important to have the consistent quality of the service, the usability of the service is additionally assessed by calculating the percentage of tests or samples with bad quality (voice samples with a MOS <1.6 for voice tests, data transfer tests with a throughput below the minimum expected value). This approach reflects how the network is dealing with the consistency of quality (distribution of the quality of the service should be close to an average value).

MOS stands for Mean Opinion Score for quality of voice services. It measures subjective perception of the voice quality by the listener. It ranges between 1 and 5, with 5 being the best.
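
The share of bad-quality samples mentioned above is a simple count below a threshold. A minimal sketch follows, using the MOS limit of 1.6 from the text; the sample values and the function name are hypothetical.

```python
from typing import Sequence


def share_below(samples: Sequence[float], threshold: float) -> float:
    """Fraction of samples whose value is strictly below the threshold."""
    if not samples:
        raise ValueError("no samples provided")
    return sum(1 for value in samples if value < threshold) / len(samples)


# Hypothetical MOS values of individual speech samples.
mos_samples = [4.1, 3.8, 1.4, 4.3, 2.9, 1.5, 4.0, 3.6]
poor_share = share_below(mos_samples, 1.6)
print(poor_share)  # 0.25 (2 of 8 samples are below 1.6)
```

The same helper applies to data tests if the threshold is replaced by the minimum expected throughput.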

For each of the KPIs measured, there is a threshold assumed as a minimum requirement which has to be met in order to achieve scoring points. Depending on the KPI, this threshold is set at a level representing the status of technology development (minimum data throughputs meeting assumptions for the implementation of 3G or 4G technologies), the usable quality of service (the MOS threshold below which speech quality is poor), or the network accessibility level corresponding to the grade of service used for telecommunication services.

The quality KPIs also have maximum values used for scoring calculations. Each maximum is interpreted as a target above which users do not perceive a real difference in quality of service, so exceeding it is not a differentiating factor in the user’s perception. Operators are not awarded additional points for exceeding this maximum threshold.

For all KPIs, a linear function is used to calculate how well a given KPI meets expectations (within the min-max range). If a higher value of the KPI is better (e.g. speech quality or throughput), the score is calculated directly; if a lower value is better (e.g. call setup time or share of samples with low throughput), the calculation is reversed.
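
A minimal sketch of such a linear mapping is shown below. The function name and all threshold values are hypothetical and are not the ETSI 103 559 limits; the point is only that one formula covers both directions when the zero-point limit and the full-point limit are passed explicitly.

```python
def linear_score(value: float, zero_points_at: float, full_points_at: float) -> float:
    """Map a KPI value linearly to the range 0..1 and clamp it at both ends.

    zero_points_at: limit at which no points are awarded
    full_points_at: limit at or beyond which full points are awarded
    If full_points_at is smaller than zero_points_at, lower KPI values are better.
    """
    if zero_points_at == full_points_at:
        raise ValueError("the two limits must differ")
    fraction = (value - zero_points_at) / (full_points_at - zero_points_at)
    return max(0.0, min(1.0, fraction))


# Higher is better (hypothetical MOS limits 2.0 and 4.0).
print(linear_score(3.0, zero_points_at=2.0, full_points_at=4.0))   # 0.5
# Lower is better (hypothetical call setup time limits 10 s and 4 s).
print(linear_score(7.0, zero_points_at=10.0, full_points_at=4.0))  # 0.5
# Exceeding the maximum earns no additional points.
print(linear_score(4.6, zero_points_at=2.0, full_points_at=4.0))   # 1.0
```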

Data gathered during benchmarking tests are used to calculate ranking scores for all measured mobile operators for voice and data services.

The importance of the voice service is set to 40%, while all data services sum up to a total of 60%. Details are presented below.

  • Voice services, affecting 40% of the overall score
  • Web browsing, affecting 22.80% of the overall score
  • Data services, affecting 15% of the overall score
  • YouTube, affecting 13.20% of the overall score
  • Messaging, affecting 9% of the overall score

The first step in calculating the single score is to calculate the score for each of the following aggregations:
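
The weights listed above add up to 100%. The sketch below shows how per-service scores could be combined with them. It assumes, as a simplification, that each service score has already been reduced to a single value between 0 and 1; the dictionary keys, the function name and the example scores are hypothetical.

```python
from typing import Mapping

# Service weights as listed above, expressed as fractions of the overall score.
SERVICE_WEIGHTS = {
    "voice": 0.40,
    "web_browsing": 0.228,
    "data_services": 0.15,
    "youtube": 0.132,
    "messaging": 0.09,
}


def combine_services(service_scores: Mapping[str, float]) -> float:
    """Weighted sum of per-service scores, each expected in the range 0..1."""
    if abs(sum(SERVICE_WEIGHTS.values()) - 1.0) > 1e-9:
        raise ValueError("service weights must sum to 100%")
    return sum(weight * service_scores[name] for name, weight in SERVICE_WEIGHTS.items())


# Hypothetical per-service scores for one operator.
example_scores = {
    "voice": 0.90,
    "web_browsing": 0.80,
    "data_services": 0.70,
    "youtube": 0.85,
    "messaging": 0.95,
}
combined = combine_services(example_scores)
print(f"{combined:.2%}")
```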

Cities:

  • Big Cities
  • Medium Cities
  • Small Cities

Roads:

  • Highways
  • Main Roads
  • Rural Roads

Complementary Areas – Walk test:

  • Hotspots
  • Railways

Individual area scores are calculated separately using the defined weights and thresholds for the types of services tested. The final score is a weighted aggregation of the scores obtained in these areas. The area scores contribute to the final score with weights reflecting their importance. The following weights are used for the aggregation Cities: Big Cities 60%, Medium Cities 30% and Small Cities 10%. The following weights are used for the aggregation Roads: Highways 60%, Main Roads 30% and Rural Roads 10%. For Complementary Areas the weights are as follows: Hotspots 60% and Railways 40%. The final score is evaluated from the weighted scores already calculated for these aggregations, which are weighted as follows: Cities 50%, Roads 40% and Complementary Areas 10%. The aggregation over areas and services is shown in Figure 5.

Fig. 5. Aggregation over areas and services

The final score, like the scores for each of the individual areas, is presented as a single percentage value which can be interpreted as the level of fulfilment of expectations/capabilities. It ranges from 0 to 100 percent.
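
The two-level aggregation over areas can be sketched as follows. The weights are those given in the text, with the assumption that the 30% weight in the Cities aggregation belongs to Medium Cities. The dictionary keys, the function name and the area scores are hypothetical, and each area score is assumed to be a value between 0 and 1.

```python
from typing import Mapping

# Weights of the three aggregations in the final score.
GROUP_WEIGHTS = {"cities": 0.50, "roads": 0.40, "complementary": 0.10}

# Weights of the individual areas inside each aggregation.
AREA_WEIGHTS = {
    "cities": {"big_cities": 0.60, "medium_cities": 0.30, "small_cities": 0.10},
    "roads": {"highways": 0.60, "main_roads": 0.30, "rural_roads": 0.10},
    "complementary": {"hotspots": 0.60, "railways": 0.40},
}


def final_score(area_scores: Mapping[str, float]) -> float:
    """Aggregate area scores (0..1) into groups, then groups into the final score."""
    total = 0.0
    for group, group_weight in GROUP_WEIGHTS.items():
        group_score = sum(
            weight * area_scores[area] for area, weight in AREA_WEIGHTS[group].items()
        )
        total += group_weight * group_score
    return total


# Hypothetical area scores for one operator.
example_areas = {
    "big_cities": 0.90, "medium_cities": 0.80, "small_cities": 0.70,
    "highways": 0.85, "main_roads": 0.75, "rural_roads": 0.60,
    "hotspots": 0.95, "railways": 0.50,
}
score = final_score(example_areas)
print(f"final score: {score:.1%}")  # final score: 82.0%
```

In this example the Cities, Roads and Complementary Areas aggregations score 85%, 79.5% and 77% respectively, which combine to 82%.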