vina wrote:Then why are there 9 serial numbers onlee, masquerading as important data?

Duh!!!. You were supposed to google / wiki for the "German Tank Problem" and ta-da, the answer pops out!

Nah, what's the fun in Googling for answers? Never heard of the German Tanki prablem, but suspected it had something to do with population estimation, hence the statement above. The problem with such problems is the naive built-in assumption that the deployed serial numbers are uniformly distributed and that the sampling is representative, which doesn't always hold up in reality when sampling is restricted to a short period of time. The usual estimation formula in this case is a function of the max. observed serial number and the number of samples: N = m + m/k - 1, where m is the max. observed serial number and k is the number of samples. The problem is that all the weight goes to the max. observed serial number, and no weight is given to the values of the other observed serial numbers.
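A minimal sketch of that usual formula, assuming serial numbers run 1..N with no gaps. The function name and the sample serials are made up for illustration, they are not from the battle data. It shows that two very different-looking samples give the same answer as long as the max and the count match:

```python
def mvue_estimate(serials):
    """Usual estimate N = m + m/k - 1 (m = max serial, k = number of samples)."""
    k = len(serials)
    if k == 0:
        raise ValueError("need at least one serial number")
    m = max(serials)
    return m + m / k - 1


# Hypothetical samples: same max (60) and same count (4), different spread.
spread_out = [19, 40, 42, 60]
bunched_low = [1, 2, 3, 60]

print(mvue_estimate(spread_out))   # 74.0
print(mvue_estimate(bunched_low))  # 74.0, the other three values are ignored
print(mvue_estimate([60]))         # 119.0, i.e. 2*60 - 1 with a single sample
```

Only max() and len() enter the estimate, which is exactly the complaint above.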

In the specific case of the Asal Uttar battle, one has to know whether or not all tanks were committed to battle (the PA may not have deployed all its tanks at the same time) and, especially, what the sampling window is. In this case, the sampling window is governed by how the tanks got ambushed. There will be a bias due to the first 2 lines of tanks getting bogged down in the muddy fields of Punjab and ambushed by the anti-tank gunners of the IA, while the rest, on seeing such a debacle, will probably downhill-ski out of there. Hence there will be a skew in the observed serial numbers of the tanks knocked out, so the usual MVUE formula may not work, depending on the skew. The skew itself will be determined by how the PA chose the tanks to use as lead scouts for cannon fodder and probing attacks. If they chose older ones with lower serial numbers, then you will have a +ve skew and estimation using the usual formula will fail. If they chose newer TFTA ghaazi ones with higher serial numbers, then you will have a -ve skew, but the usual estimation will still work becoz the usual formula only cares for the max. value and the number of samples (not the values of the samples themselves). If they had an ISI graduate who did SRSWOR across the entire batch of tanks to choose the cannon fodder, then also your usual estimation formula for the population max. will work. But in reality, in the case of an attack on entrenched positions, it is the +ve skew scenario which might happen most often, throwing the estimates haywire and causing mucho pain, since it will result in severe underestimation of PA strength.
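A rough simulation of the point, under assumptions that are mine and not from the battle: a true inventory of 200 tanks, 9 knocked out per engagement, and a hypothetical "old tanks first" rule where the lead elements are drawn only from the oldest half of the serials. It compares the average MVUE estimate under SRSWOR with the average under the biased draw:

```python
import random


def mvue_estimate(serials):
    """Usual estimate N = m + m/k - 1 (m = max serial, k = number of samples)."""
    m, k = max(serials), len(serials)
    return m + m / k - 1


def srswor(rng, true_n, k):
    """Simple random sampling without replacement over all serials 1..true_n."""
    return rng.sample(range(1, true_n + 1), k)


def old_tanks_first(rng, true_n, k):
    """Hypothetical bias: lead tanks come only from the oldest half of the serials."""
    return rng.sample(range(1, true_n // 2 + 1), k)


def average_estimate(true_n, k, sampler, trials=2000, seed=0):
    """Average the MVUE estimate over many simulated engagements."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += mvue_estimate(sampler(rng, true_n, k))
    return total / trials


TRUE_N = 200  # assumed true inventory
K = 9         # assumed number of knocked-out tanks examined

fair = average_estimate(TRUE_N, K, srswor)
biased = average_estimate(TRUE_N, K, old_tanks_first)
print(f"true inventory:          {TRUE_N}")
print(f"avg estimate, SRSWOR:    {fair:.1f}")
print(f"avg estimate, old first: {biased:.1f}")
```

Under SRSWOR the average lands close to the true 200. Under the old-first rule the formula can only ever "see" the oldest half, so it settles near 100 and underestimates the inventory by roughly a factor of two.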

One way to partially remedy the defect might be to include a weighted term which is a function of the skewness of the observed serial numbers of the knocked-out tanks, and throw out the uniform distribution requirement. For example, if one observes a strong +ve skew (observed serials bunched at the low end) and it is an attack on known enemy entrenched positions, then there is a higher likelihood of the actual number of tanks in inventory being much larger, with the observed max. serial number actually located close to the median or even below it. The usual formula cannot capture this, since its estimate is always bounded by [max.observed.serial.no, 2*max.observed.serial.no - 1].
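The post above does not say what the weighting would look like, so the sketch below only computes the diagnostic itself: the plain moment-based sample skewness of the observed serials. The two samples are invented for illustration. How the skew would then be fed back into the estimate is left open.

```python
def sample_skewness(xs):
    """Moment-based sample skewness: m3 / m2**1.5 (no small-sample correction)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    if m2 == 0:
        return 0.0  # all values identical, skewness undefined; report 0
    return m3 / m2 ** 1.5


# Hypothetical observed serials, 9 tanks each.
low_heavy = [3, 5, 8, 11, 14, 17, 21, 60, 95]     # mostly old tanks
high_heavy = [5, 40, 79, 83, 86, 89, 92, 95, 97]  # mostly new tanks

print(sample_skewness(low_heavy) > 0)   # True: +ve skew, the worrying case
print(sample_skewness(high_heavy) < 0)  # True: -ve skew
```

With only 9 samples the skewness is itself a noisy number, so it is at best a warning flag and not a correction factor.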

But in the end, we cannot depend on the results of one battle, especially with an enemy which doesn't control the production of its tanks and only has a limited number of them at its disposal.

That's why there are lies, damn lies and statistics. And a further reason why I have an inherent distrust of models (mathematical ones, not plastic or flesh ones).