Wednesday, November 6, 2019
Investigating how much the 5 pence minimum charge on local calls increases the cost of making local calls Essays
The aim of this coursework is to discover how much the 5 pence minimum charge (to quote BT: "the minimum charge for each call remains at 5p inclusive of VAT") on local calls increases the cost of making local calls. This information would then be used to ascertain whether moving to a telephone operator without a minimum fee would be a good idea: if moving would save money, it would be worth doing. Certainly, looking over the current bill, there seem to be many calls of less than 5 minutes. As local rates are 1 pence per minute during the evening and weekend (and the phone is always switched off during the daytime), any call shorter than 5 minutes would cost less than 5 pence were it not for the minimum charge.

Data Collection

The population is the entirety of local calls made during a 3-month period. The population is defined this way because of the possession of only one itemised bill, which happens to run from the 1st of November to the 1st of February. The current telephone operator is BT. This bill (i.e. the population) lists 886 calls, with a total duration of 30 hours, 38 minutes and 57 seconds. The complete cost for this period is £50.30 (to 2 d.p., before VAT). The population is continuous because time can be given to any number of decimal places.

The data is surely as accurate as it is possible to be: only untraceable computer bugs could explain inaccuracies. As computer problems are unlikely for a large multi-national firm such as BT, we can consider the data within the population to be reliable and of good quality.

An appropriate sampling method here could be random sampling. An example of this might be labelling 886 pieces of paper, throwing them into a hat and selecting 50.
Also, a computer could be used to select 50 random integers between 0 and 887 (exclusive). Such a random method would, however, give no indication of whether the duration of calls remained constant over time. Random sampling may also allow a certain cluster of calls to dominate the general trend.

Stratified sampling would not be appropriate for the investigation because we want to find out the number of under-5-minute calls, not just select a certain number of them. This sampling method would be useful for a sub-investigation, for example to see how much effect the over-5-minute calls have on the charging, by selecting only the over-5-minute calls and assessing their magnitude.

Another way of sampling would be to select every xth call, where x is the result of 886/50 rounded down to the nearest integer. This would mean recording the call duration for every 17th call (because 886/50 = 17.72). This will generate more than 50 samples (52, because 886/17 = 52.1); however, 50 is a minimum, not a maximum.

As the data is probably accurate, there cannot be any erroneous outliers. Please note: a computer maths or spreadsheet package has not been used in the preparation of this report. The data collected is given below (and then sorted by hand):

Call duration in seconds (in order of collection):
53, 50, 10, 120, 413, 45, 274, 99, 48, 50, 100, 53, 54, 42, 44, 11, 108, 36, 38, 54, 47, 41, 54, 35, 47, 50, 269, 42, 35, 55, 35, 227, 35, 33, 36, 39, 170, 35, 1077, 38, 31, 33, 45, 1266, 34, 107, 20, 57, 38, 37, 42, 41

Call duration in seconds (sorted):
10, 11, 20, 31, 33, 33, 34, 35, 35, 35, 35, 35, 36, 36, 37, 38, 38, 38, 39, 41, 41, 42, 42, 42, 44, 45, 45, 47, 47, 48, 50, 50, 50, 53, 53, 54, 54, 54, 55, 57, 99, 100, 107, 108, 120, 170, 227, 269, 274, 413, 1077, 1266

Table 1: raw data and raw data sorted

Analysis

Mean

A mean gives some idea of central tendency (the average call duration). It will, however, be affected by the extreme values 413, 1077 and 1266. Therefore, it seems appropriate to measure the mean with and without these 3 values.
With the 3 values (done using a calculator, to 2 d.p.): $\bar{x} = \frac{\sum x}{n} = \frac{5853}{52} = 112.56$ seconds, or 1 minute 52.56 seconds. Clearly, this answer is higher than most of the times, so it is not particularly representative.

Without the 3 values (done using a calculator, to 2 d.p.): $\bar{x} = \frac{\sum x}{n} = \frac{3097}{49} = 63.20$ seconds, or 1 minute 3.20 seconds. The second answer seems more representative, but is still higher than the majority of the times.

Mode

The mode is another measure of central tendency, giving the most frequent call duration. The mode is clearly 35 seconds (this call duration is the most frequent, being repeated 5 times). From a visual inspection, this time seems to give quite a good indication of central tendency, as most of the times fall within the range 9 < t < 61 (t in seconds).

Median

Another measure of central tendency is the median, which may be calculated numerically or read off a cumulative frequency graph. As the latter is not necessarily accurate because it is hand drawn, the numerical method will be used. The median gives the middle number, and as such is not affected by the extreme numbers. It does not, however, take account of the actual values of the call durations. Hopefully, it should give some idea as to the average of the data.

To calculate the median, the numbers must each be assigned a rank. The median is the mean of the durations assigned to the 26th and 27th ranks (because n = 52 is even, 52/2 = 26 and (52/2) + 1 = 27). The ranks, which were assigned manually to the data, are given in the table below:

Rank   Call duration in seconds (sorted)
1      10
2      11
3      20
4      31
5      33
6      33
7      34
8      35
9      35
10     35
11     35
12     35
13     36
14     36
15     37
16     38
17     38
18     38
19     39
20     41
21     41
22     42
23     42
24     42
25     44
26     45
27     45
28     47
29     47
30     48
31     50
32     50
33     50
34     53
35     53
36     54
37     54
38     54
39     55
40     57
41     99
42     100
43     107
44     108
45     120
46     170
47     227
48     269
49     274
50     413
51     1077
52     1266

Table 2: raw data sorted and ranked (by hand)

The 26th and 27th values are both 45 seconds. Therefore, the median is 45 seconds ((45 + 45)/2 = 45).
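The three measures of central tendency above were worked out by hand and calculator. As a cross-check only, a minimal Python sketch using the standard `statistics` module is given here; the variable names are illustrative and the list is simply the sorted sample from Table 2.

```python
from statistics import mean, median, multimode

# Sorted sample of 52 call durations in seconds, copied from Table 2.
durations = [
    10, 11, 20, 31, 33, 33, 34, 35, 35, 35, 35, 35, 36,
    36, 37, 38, 38, 38, 39, 41, 41, 42, 42, 42, 44, 45,
    45, 47, 47, 48, 50, 50, 50, 53, 53, 54, 54, 54, 55,
    57, 99, 100, 107, 108, 120, 170, 227, 269, 274, 413, 1077, 1266,
]

# Mean with all 52 values, and with the 3 largest values left out.
mean_all = mean(durations)
mean_without_extremes = mean(sorted(durations)[:-3])

# multimode returns every most-frequent value, so a tie would be visible.
modes = multimode(durations)

# With an even number of values, median averages the two middle values.
median_value = median(durations)

print(f"mean (all 52 values):      {mean_all:.2f} s")
print(f"mean (without 3 extremes): {mean_without_extremes:.2f} s")
print(f"mode(s):                   {modes}")
print(f"median:                    {median_value} s")
```

Because the list is typed in once and every statistic is derived from it, a transcription slip would show up as a disagreement with the hand-calculated figures.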
From visual inspection, the median does seem to represent all but the extreme values; however, the mode seems to give the best measure of central tendency.

Ranges

The range is 10 to 1266 (= 1256). If the final 3 values were removed, the range would be 10 to 274 (= 264). This shows that the 3 values have an extreme effect upon the calculations that take them into account. The inter-quartile range would give a better measure of spread, as it would avoid the extreme values. The lower quartile is the median of the values between the lower bound and the median of the whole data set, whilst the upper quartile is the median of the values between the median of the whole data set and the upper bound.

Lower quartile

The 13th value is 36, as is the 14th. So, the lower quartile is 36 seconds.

Upper quartile

The 39th value is 55, and the 40th is 57. So, the upper quartile is 56 seconds.

Thus, the inter-quartile range is 56 - 36, or 20 seconds. This tells us that the middle half of the sample lies within a 20 second range. Since half of 52 is 26, we know that 26 of the sample values fall within 20 seconds of each other. Thus, we can deduce that the median is a good measure of central tendency.

Calculating the midrange would not be applicable, as the extreme values would have too significant an effect on the result.

Standard deviation

The standard deviation measures the average deviation from the mean (not an exact average because of the squaring and square-rooting, hence "standard"). As roughly 2/3 of the data typically lie within one standard deviation of the mean, perhaps it will be a good measure of spread and central tendency. The standard deviation will help tell us whether 2/3 of the data is within 5 minutes, and if so it may be worth changing the operator. The formula used is: $\sigma = \sqrt{\frac{\sum x^2}{n} - \bar{x}^2}$. The squared call durations are given below (these squares were calculated using a calculator).
Call duration in seconds   Call duration in seconds (squared)
10      100
11      121
20      400
31      961
33      1089
33      1089
34      1156
35      1225
35      1225
35      1225
35      1225
35      1225
36      1296
36      1296
37      1369
38      1444
38      1444
38      1444
39      1521
41      1681
41      1681
42      1764
42      1764
42      1764
44      1936
45      2025
45      2025
47      2209
47      2209
48      2304
50      2500
50      2500
50      2500
53      2809
53      2809
54      2916
54      2916
54      2916
55      3025
57      3249
99      9801
100     10000
107     11449
108     11664
120     14400
170     28900
227     51529
269     72361
274     75076
413     170569
1077    1159929
1266    1602756

Table 3: call durations sorted and squared (using a calculator)

$\sigma = \sqrt{\frac{\sum x^2}{n} - \bar{x}^2} = \sqrt{\frac{3288791}{52} - \left(\frac{5853}{52}\right)^2} = 224.89$ seconds (calculated using un-rounded values, and checked using the statistical function of the calculator).

Therefore, 2/3 of the call durations lie within 224.89 seconds of the mean. 224.89 seconds is 3 minutes 44.89 seconds, showing that 2/3 of the call durations lie within 5 minutes of the mean. This strongly indicates that changing operator would be beneficial.

Number of call durations less than 5 minutes

Another useful calculation is how many of the call durations fall below 5 minutes (300 seconds). If we consider the sample reasonably representative, then we can say that the ratio of under-5-minute to over-5-minute calls is 49:3. If the sample is representative, this means there is a very high number of calls below 5 minutes.

Displays

Box and whisker diagram

A good way of summarising some of the calculations above is a box and whisker diagram. This is figure 1.

Frequency polygon

A frequency polygon is an appropriate method of discovering the shape of the distribution (see figure 2 and sheet F2). Please note that the last 3 call lengths were not included, as this would have adversely affected the scale and not given a meaningful display. Figure 2 shows a slight positive skew. There is a tight distribution about 20-60 seconds. After 60 seconds, there are only occasional call lengths.
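The figures that a box and whisker diagram summarises (minimum, quartiles, median, maximum), together with the standard deviation and the 49:3 split, were all found by hand. The following is a minimal Python sketch that recomputes them for checking purposes. It assumes the "median of each half" quartile method described in the Ranges section and the population form of the standard deviation, since that is the form that reproduces 224.89; the variable names are illustrative.

```python
from math import sqrt
from statistics import median

# Sorted sample of 52 call durations in seconds, copied from Table 2.
durations = [
    10, 11, 20, 31, 33, 33, 34, 35, 35, 35, 35, 35, 36,
    36, 37, 38, 38, 38, 39, 41, 41, 42, 42, 42, 44, 45,
    45, 47, 47, 48, 50, 50, 50, 53, 53, 54, 54, 54, 55,
    57, 99, 100, 107, 108, 120, 170, 227, 269, 274, 413, 1077, 1266,
]
n = len(durations)

# Quartiles as the medians of the lower and upper halves of the sorted data.
lower_half = durations[: n // 2]
upper_half = durations[(n + 1) // 2:]
five_number_summary = (
    min(durations),
    median(lower_half),
    median(durations),
    median(upper_half),
    max(durations),
)
inter_quartile_range = five_number_summary[3] - five_number_summary[1]

# Standard deviation as sqrt(sum of squares / n - mean squared).
mean_value = sum(durations) / n
std_dev = sqrt(sum(t * t for t in durations) / n - mean_value ** 2)

# Calls shorter than 5 minutes (300 seconds) versus the rest.
under_5_min = sum(1 for t in durations if t < 300)
over_5_min = n - under_5_min

print("min, LQ, median, UQ, max:", five_number_summary)
print("inter-quartile range:", inter_quartile_range)
print(f"standard deviation: {std_dev:.2f} s")
print(f"under : over 5 minutes = {under_5_min}:{over_5_min}")
```

Other quartile conventions (for example interpolated percentiles) can give slightly different quartile values on the same data, so the method should be stated alongside any figure quoted.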
Cumulative frequency diagram

Suspecting a tight distribution, a cumulative frequency diagram was created (the steeper the S shape, the tighter the distribution about the median). A cumulative frequency diagram gives better results if the data is spread evenly throughout the intervals; this appears to be the case. Cumulative frequency is calculated as a running total of all frequencies. The class width is 10 seconds, allowing for the fact that the data is widespread. As the data is integer, the bounds are inclusive.

Bounds (seconds)   Frequency   Cumulative frequency
0-9                0           0
10-19              2           2
20-29              1           3
30-39              16          19
40-49              11          30
50-59              10          40
60-69              0           40
70-79              0           40
80-89              0           40
90-99              1           41
100-109            3           44
110-119            0           44
120-129            1           45
130-139            0           45
140-149            0           45
150-159            0           45
160-169            0           45
170-179            1           46
180-189            0           46
190-199            0           46
200-209            0           46
210-219            0           46
220-229            1           47
230-239            0           47
240-249            0           47
250-259            0           47
260-269            1           48
270-279            1           49

Table 4: cumulative frequency of call durations

The cumulative frequency graph shows a tight S shape, indicating a tight concentration of values around the median. This confirms the calculations made above (i.e. the inter-quartile range). The top of the S shape is particularly long and straight, indicating widespread but infrequent data past the upper quartile (56 seconds).

Interpretation

It has been discovered that there is a large concentration of call durations below 5 minutes, indicating that moving to a different operator may be advisable, although any increase in the pence per minute rate resulting from such a move may negate the benefit.

With 49 of the 52 calls lasting less than 5 minutes, and the mean, mode, median, standard deviation and quartiles all falling below 5 minutes, it is clear that most calls are much less than 5 minutes in duration, according to the sample. The positive skew also supports this point, by showing that calls tend towards durations lower than 5 minutes.
It has also been discovered that the 3 calls above 5 minutes are considerably above five minutes. This points to the pence per minute increase problem outlined above.

Therefore, it has been shown that (provided the sample is representative) if call charges are not increased and the 5 pence minimum is dropped, moving to another call operator would be sensible.

The data of the population is of good quality, as indicated in the introductory paragraphs. The data is very accurate, as a computer calculated it automatically and the sample was gathered by hand 3 times (to ensure there were no errors in gathering the sample). The extreme values were therefore verified 2 times. In terms of validation, none of the data should be (or was) negative. This was checked manually after sample collection.

It certainly appears that the data was worth collecting, as it has shown a strong bias towards call durations of under 5 minutes. Even if the data had not shown this, it would have told us that changing to another operator might not be such a good idea.

With there being only 3 calls over 5 minutes, these 3 calls are probably not representative of the over-5-minute calls in the population. If it can be shown that many of the over-5-minute calls are considerably over 5 minutes, this could be a problem if there are pence per minute increases with a new operator that has no 5 pence minimum charge: the effect of the increase may negate the effect of the lack of a minimum charge.

Accuracy and refinements

The sample only represents 5.9% ($\frac{52}{886} \times 100$, to 1 d.p.) of the population. Thus, how representative the sample is of the population is questionable, especially as only 3 of the sample were over 5 minutes in duration.

Using a telephone can often result in somewhat fixed patterns of calling, so a systematic method of data gathering may be unduly influenced by these routines.
Gathering data by eye from a long list of numbers that are similar in appearance (as in this case) may be particularly error prone.

Long calculations such as the standard deviation may not be wholly accurate, despite repeating the calculations 2 times and using the calculator as a further backup.

The impact of other call types, such as national and international, has not been assessed; a change in operator may adversely affect these rates.

A greater sample and/or population size could improve the reliability of the analysis, as the sample gathered may be more representative. A greater population could be found by collecting different quarterly bills. A greater sample size could be obtained by changing the number of call durations to skip to a number less than 17.

To gain a more representative sample of over-5-minute calls, it would be acceptable to carry out a sub-investigation on a sample of over-5-minute calls (i.e. a sample of the population of over-5-minute calls). This would help us discover how much of an effect call charges would have on the bill.

If the whole population could be gathered, say by optical character recognition technology, then a computer statistics package could be used to calculate and graph the data. This would eliminate any problems associated with sampling but would be costly if the hardware and software required were not easily accessible. Also, optical character recognition is not 100% accurate, so the data would have to be checked by hand, a long and uncomfortable occupation.

There exist more advanced methods of calculating various statistics, which could be utilised to give greater, more refined insight. An example might be calculating the skew or kurtosis numerically.

Another way of determining the number of under-5-minute calls would be to manually count the number of over-5-minute calls in the whole population and subtract that number from 886 (the total number of local rate telephone calls).
This would give an accurate proportion of under-5-minute calls, but would not give their nature (e.g. how close to the 5 minute barrier they are).

Other graphs or charts could be drawn to give further insight into the data set. An example might be a pie chart showing the proportion of over- and under-5-minute calls in the sample.

A different method of sampling could be employed to avoid the routine problem that a systematic method carries or the unrepresentative problem that a random method carries. Perhaps a mixture of the two could be employed to generate a higher quality sample. For instance, the number of call durations skipped before collecting another could be randomly generated using the random number facility on a calculator.

Moving a piece of paper or ruler (already marked) down the bill can help to ensure that durations are indeed 17 call durations apart (by sight).

As a change in operator may affect all charge rates, the effect upon the cost of national and international calls should be assessed, or else savings from the removal of the 5 pence minimum charge could be negated and the new operator forsaken (and it may in fact cost money to return to BT).

There are in fact only 5 international calls, totalling 1 hour 4 minutes in duration and costing a total of £12.77. With such a small population, statistical analysis is not advised. Merely comparing the call charges with those of other operators should be sufficient. In addition, it should be remarked that the cost of the international calls is quite low and would probably not negate a saving made from removing the 5 pence minimum charge. This is especially true as many non-BT operators are constantly advertising their low international rates.

There are 57 national calls, with a total duration of 2 hours and 17 minutes, costing a total of £5.48. With such a small charge, it hardly seems worth investigating.
It would be possible to see how much the same call time (2 hours and 17 minutes) would cost with another operator.

It might be advisable, as a further extension to the investigation, to find out (for the sample data) how many extra minutes are actually charged for because of the 5 pence minimum charge. This would help discover how much one would save when moving to a new operator.

In conclusion, it seems that several statements can be made:

o The sample is fairly accurate (with the population being as accurate as possible), but could be improved in terms of representation.

o The cost of over-five-minute local calls, national calls and international calls should be calculated against any increase in charges from another operator, to ensure that the effect of the increases does not cancel out the effect of removing the 5 pence minimum charge.

o Most local rate calls are less than five minutes in duration, but it is possible that some of the over-5-minute calls are particularly long in duration.
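The extension suggested above, measuring how much of the sample's cost is due to the minimum charge, could be sketched in Python as follows. This is only an illustration under stated assumptions, not BT's actual tariff: it assumes every sampled call is charged at the 1 pence per minute evening and weekend rate, that charging is pro rata by the second with no rounding, and that VAT is ignored. The constants and function names are hypothetical.

```python
# Assumed tariff (not taken from a BT price list): 1p per minute,
# charged pro rata per second, with a 5p minimum per call.
RATE_PENCE_PER_MINUTE = 1.0
MINIMUM_CHARGE_PENCE = 5.0

# Sorted sample of 52 call durations in seconds, copied from Table 2.
durations = [
    10, 11, 20, 31, 33, 33, 34, 35, 35, 35, 35, 35, 36,
    36, 37, 38, 38, 38, 39, 41, 41, 42, 42, 42, 44, 45,
    45, 47, 47, 48, 50, 50, 50, 53, 53, 54, 54, 54, 55,
    57, 99, 100, 107, 108, 120, 170, 227, 269, 274, 413, 1077, 1266,
]


def cost_without_minimum(seconds):
    """Cost in pence at the assumed per-minute rate, with no minimum."""
    return seconds / 60 * RATE_PENCE_PER_MINUTE


def cost_with_minimum(seconds):
    """Cost in pence when the assumed 5p minimum charge applies."""
    return max(MINIMUM_CHARGE_PENCE, cost_without_minimum(seconds))


total_with_minimum = sum(cost_with_minimum(t) for t in durations)
total_without_minimum = sum(cost_without_minimum(t) for t in durations)
extra_due_to_minimum = total_with_minimum - total_without_minimum

# A call is affected when the minimum raises its price above the timed cost.
calls_affected = sum(
    1 for t in durations if cost_with_minimum(t) > cost_without_minimum(t)
)

print(f"sample cost with minimum:    {total_with_minimum:.2f}p")
print(f"sample cost without minimum: {total_without_minimum:.2f}p")
print(f"extra due to minimum:        {extra_due_to_minimum:.2f}p")
print(f"calls affected:              {calls_affected} of {len(durations)}")
```

Any comparison with another operator would replace the assumed rate with that operator's real per-minute price, which is exactly the increase the conclusion warns could cancel out the saving.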