Some statistical thoughts
Discussion started by tallliman , 04 July, 2019 09:44
Some statistical thoughts
tallliman 04 July, 2019 09:44
I thought I'd try to lend a basic statistical insight into this season. I welcome those better versed in stats than me to correct me. Below is a table of batsmen per county who have played at least 8 innings this season (2nd column), the number of those with above-median runs (3rd column) and the number with upper-quartile runs (4th column).

Of a total of 165 batsmen meeting the criteria, we'd expect about 9-10 for each county (assuming a level of consistency of selection), 4-5 above median and 2-3 in the upper quartile. On this basis, I'd argue that our batting isn't too far off what you'd expect assuming all the data is normally distributed. That seems a bit odd given it feels like we've struggled to make enough runs... I'll look into the bowling to look at this from the other way.

County	Batsmen with 8+ innings	Above median runs	Upper quartile runs
Dur	9	5	2
Dy	11	7	5
Es	9	5	2
Gm	10	7	3
Gs	7	4	3
Ham	9	3	1
Kent	9	5	4
La	5	3	1
Le	8	4	2
Mx	8	4	2
Nr	11	5	3
Nt	10	5	3
Sm	12	5	2
Sx	10	6	2
Sy	10	7	3
Wk	11	3	2
Wo	7	2	0
Yor	9	3	2
Total	165	83	42
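
As a rough check on the per-county expectations quoted above, the totals in the bottom row can simply be divided by the 18 counties. A minimal sketch in Python; the names are my own, and it assumes qualifiers are spread evenly across counties, which real selection will not quite match:

```python
# Totals copied from the bottom row of the table above.
TOTAL_QUALIFIERS = 165  # batsmen with at least 8 innings
ABOVE_MEDIAN = 83
UPPER_QUARTILE = 42
COUNTIES = 18  # number of rows in the table


def expected_per_county(total, counties=COUNTIES):
    """Expected count per county if the total were shared out evenly."""
    return total / counties


expected = {
    "qualifiers": expected_per_county(TOTAL_QUALIFIERS),
    "above median": expected_per_county(ABOVE_MEDIAN),
    "upper quartile": expected_per_county(UPPER_QUARTILE),
}

for label, value in expected.items():
    print(f"{label}: {value:.2f}")
```

That comes out at roughly 9.2 qualifiers, 4.6 above the median and 2.3 in the upper quartile per county, in line with the 9-10, 4-5 and 2-3 quoted.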

Edited 1 time(s). Last edit at 04/07/2019 09:46 by tallliman.

Re: Some statistical thoughts
adelaide 04 July, 2019 12:09

Just to clarify, when you refer to medians and quartiles are you referring to the median for players from that county or for all players (with >=8 innings)? Looking at the data, I think it must be all players.

You don't need to assume a normal distribution as by definition half fall on one side of the median.

That is just as well, as most batsmen's scores do not follow anything like a normal distribution: too many zeroes and single-figure scores, and too many clustered around landmarks (particularly for Derbyshire this week). Oh, and no negative scores, which even we have not managed yet! However, if you look at an average (mean), the bigger the sample, the better the normal distribution becomes as a fit for those averages.
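
The point about averages can be illustrated with a small simulation. This is only a sketch: it assumes individual scores follow a rounded-down exponential distribution with a mean of around 30, a toy model chosen because it has many low scores and no negative ones, not something fitted to real scorecards. The names `skewness` and `simulated_averages` are made up for the example.

```python
import random
import statistics


def skewness(values):
    """Sample skewness: close to 0 for a symmetric shape such as the normal."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def simulated_averages(innings, players, rng, mean_score=30.0):
    """Mean score per simulated player over the given number of innings."""
    averages = []
    for _ in range(players):
        # Toy assumption: each score is an exponential draw rounded down.
        scores = [int(rng.expovariate(1 / mean_score)) for _ in range(innings)]
        averages.append(statistics.fmean(scores))
    return averages


rng = random.Random(2019)  # fixed seed so the run is repeatable
single_innings = simulated_averages(1, 5000, rng)
forty_innings = simulated_averages(40, 5000, rng)

print("skewness of single scores:", round(skewness(single_innings), 2))
print("skewness of 40-innings averages:", round(skewness(forty_innings), 2))
```

Under this toy model the single scores are heavily skewed, while the 40-innings averages are much closer to symmetric, which is the sense in which the normal fit improves with sample size.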

I'll take a wild guess that we would have one bowler above the (national) median - and quite possibly above the upper quartile - and the rest below.

That, combined with what I suspect is a large disparity between first and second innings batting, is probably why we are where we are.


Re: Some statistical thoughts
tallliman 04 July, 2019 22:28
I'll try to do the bowling at some point. Agreed about the normal distribution, that was clumsy of me. What I meant was that I wanted to be more rigorous than just looking at the median etc.

Re: Some statistical thoughts
Primrose Hillbilly 07 July, 2019 12:19

Given that the emphasis in the CC is on getting big first innings totals (hence the way the bonus points are allocated), it would be interesting to see the correlation between your figures and the number of times each side has recorded over 350.

I used 350 because even we (this season) should manage not to lose were we to bat first and make that total.

I would not decry the use of statistics as a way of trying to discern overall effectiveness of both individual players and teams.

From what I read, for a long time baseball stats revolved around a player's overall average of scoring a hit, the number of home runs he scored, and the number of times he batted in other players already "on base".
The latter encouraged a lot of players to go for the big hit if they had 2 or 3 guys already in a scoring position, and get out.

Then, some guy started publishing - almost as a hobby - his end of season baseball stats in a limited print run, and it sold out. Others started producing their own and contributing, so the next year he ran a massively larger print run and that sold out.

From that was born the concept of "on base percentage" - the number of times a guy just got to first base, and so made it possible for others before him to move round, and 3 hits later, he would have scored a run.

Dingy Bags told me that he discussed with Clive Radley whether he - with his ability to always nudge or nurdle a single and turn a two into a three - would have been effective at T20, as Rad always "got on base", but Rad said he would not have hit enough sixes to succeed nowadays.

My impression of 2016 was that we usually had a couple of batsmen put their hands up and get a score, and we had a good late order, but we also had a very effective bowling attack, such that our bowlers outbowled "theirs", and our 7, 8, 9 outbatted "theirs" too.

Brearley too, had a bowling attack to call upon that outbowled their peers, although batting points only counted up to 300 in those days, I believe.

Another slight statistical tweak of that time was that in one season, Middlesex noticed that Wayne Daniel was being driven for runs in front of the wicket more often than previously, and discovered that he was aiming for a prize sponsored by The Sun for the bowler who hit the stumps most often.
The players agreed to club together and make up the difference to him if he didn't win that prize, so normal service in the slip cordon resumed.

I will be very interested to read any further thoughts.

Re: Some statistical thoughts
adelaide 08 July, 2019 00:22

Your Wayne Daniel example is a good illustration of the rule that the moment you set up a performance indicator on which money or career progression depend, it begins to lose its value. Best seen in the NHS. People start gaming the system and in the end that subverts what the job is really supposed to be.

Bonus points have been stable for some years now but in their early days the batting points in particular kept changing. At first you got a batting point for each additional batch of 25 runs you got over 150 in the first 85 overs. So if you managed 450 you would have got 12 batting points! Then there was a year where 75 in the first 25 overs was also worth a point.
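
The arithmetic behind that 12 can be sketched as below. This follows the rule only as described in this post (one point per complete 25 runs beyond 150 in the first 85 overs); I have not checked it against the actual playing conditions, and the function name is invented for the example.

```python
def early_batting_points(runs_in_first_85_overs):
    """Batting points under the early rule as described in this thread."""
    runs_over_threshold = max(0, runs_in_first_85_overs - 150)
    return runs_over_threshold // 25  # one point per complete batch of 25


print(early_batting_points(450))  # 12
```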

The most bizarre one that I found when I checked the above was that there was one season (before bonus points) where each first innings was limited to 65 overs. The bizarre bit is that it only applied the first time the two counties met in the season (and not at all if they only met once).

I used to think that Americans were attracted to sports which could be translated into loads of statistics. For NFL matches the screens were full of such stuff years ago. The one that made me squirm was reference to the "winningest" team. However, when the Moneyball film came along, one of its points was how little use baseball management made of the information readily available on players, relying instead on gut instinct. Thus the first team which did use the data could find bargains.

I suspect that Moneyball type thinking has found its way into football here. To be fair, how else can you assess someone playing in Romania's third division? Cricket is surely much more subtle. As has been said many times, Michael Vaughan's batting average would never have got him into the England team on its own. A player's averages are also very much affected by the quality of home pitches.

Statistics were also used to justify the long ball approach in football, on the basis that most goals were scored after three or fewer passes. This led to Taylor's Watford (with the saving grace of John Barnes) and then to copycats like Wimbledon who added brutality to the mix. Arguably it set English football back twenty years.

Statistics is the study of variation, not of averages, and that seems to me where the long ball theory went wrong. If your outswinger brings you most of your wickets, you don't exclusively bowl outswingers. If your NFL running game is weak, you still use it to keep the defence honest. Your "three pass" goal may arise because of a poor clearance from a ten pass move, or because the defence was not sure what form of attack was coming.


Re: Some statistical thoughts
BeefyRoberts 11 July, 2019 15:59
Something else for here, regarding bonus points this year.
After the last county games, the up-to-date tables are of course published.
We have 15 batting points; only Durham have fewer, with 13.
For bowling points, we are bottom of the table with 19.

Re: Some statistical thoughts
tallliman 12 July, 2019 15:51
As requested, here is the same chart for the bowling, using a minimum of 10 wickets, although it didn't really satisfy me. The key bit is that we only have 4 bowlers with 10 wickets or more against an average of 4-5 based on the data. Not sure if I want to read too much into the rest of it to be honest...

County	Bowlers with 10+ wickets	Bowlers with above median wickets	Bowlers with upper quartile wickets
Dur	5	2	2
Dy	4	3	2
Es	5	3	2
Gm	6	2	0
Gs	4	4	0
Ham	3	3	2
Kent	6	3	2
La	5	3	2
Le	6	2	1
Mx	4	1	1
Nr	3	1	1
Nt	4	1	0
Sm	5	3	2
Sx	5	1	1
Sy	6	3	1
Wk	5	2	2
Wo	6	2	1
Yor	5	3	1
Total	87	42	23

I've always been fascinated by the tables of runs/wicket in Wisden. So whilst the below excludes extras not credited to batsmen/bowlers, it's quite interesting... Crucially, we give our wickets away more cheaply on average than we take them from the opposition. It's not as significant as Northants (which almost looks like an error... perhaps influenced by Luke Wood/Jamie Overton not being credited to them) but does hint at a general reverse so far.

	Bowling			Batting			
County	Runs	Wickets	Avg	Runs	Innings	Not Out	Avg
Dur	3798	145	26.19	3733	175	19	23.93
Dy	3731	138	27.04	4499	172	20	29.60
Es	3629	153	23.72	3425	139	19	28.54
Gm	4776	125	38.21	4881	151	24	38.43
Gs	3715	112	33.17	3144	125	17	29.11
Ham	3887	135	28.79	3967	152	20	30.05
Kent	4592	153	30.01	4155	171	22	27.89
La	3317	150	22.11	3259	125	21	31.34
Le	3691	110	33.55	3582	150	22	27.98
Mx	3366	104	32.37	3690	144	17	29.06
Nr	3578	96	37.27	4638	172	23	31.13
Nt	4281	111	38.57	3335	177	19	21.11
Sm	3790	187	20.27	3539	162	19	24.75
Sx	4577	139	32.93	4046	156	20	29.75
Sy	4101	141	29.09	4402	182	19	27.01
Wk	4724	154	30.68	3917	166	18	26.47
Wo	3643	128	28.46	2970	132	18	26.05
Yor	3830	130	29.46	4074	152	21	31.10
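
For anyone wanting to rework the comparison, here is a minimal Python sketch of the gap between batting and bowling averages for three rows of the table. It assumes the averages are runs per wicket for bowling and runs per completed innings (innings minus not outs) for batting, which reproduces the figures shown; the function names and the choice of rows are mine.

```python
def bowling_average(runs_conceded, wickets):
    """Runs conceded per wicket taken."""
    return runs_conceded / wickets


def batting_average(runs_scored, innings, not_outs):
    """Runs scored per completed innings, i.e. per wicket lost."""
    return runs_scored / (innings - not_outs)


# (bowling runs, wickets, batting runs, innings, not outs), copied from the table
ROWS = {
    "Mx": (3366, 104, 3690, 144, 17),
    "Nr": (3578, 96, 4638, 172, 23),
    "Nt": (4281, 111, 3335, 177, 19),
}


def margin_per_wicket(row):
    """Batting average minus bowling average; negative means a deficit."""
    conceded, wickets, scored, innings, not_outs = row
    return batting_average(scored, innings, not_outs) - bowling_average(conceded, wickets)


for county, row in ROWS.items():
    print(f"{county}: {margin_per_wicket(row):+.2f}")
```

On the table's figures this gives -3.31 for Mx, -6.14 for Nr and -17.46 for Nt, where a negative number means wickets are being lost more cheaply than they are taken.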
