Tag Simulation & Wargaming

Validating A Combat Model (Part VI)

Advancing Germans halted by 2nd Battalion, Fifth Marines, June 3, 1918. At Les Mares Farm, 2 1/2 miles west of Belleau Wood, the Germans attacked the American lines through the wheat fields. From a painting by Harvey Dunn. [U.S. Navy]

[The article below is reprinted from the April 1997 edition of The International TNDM Newsletter.]

The First Test of the TNDM Battalion-Level Validations: Predicting the Winners
by Christopher A. Lawrence

CASE STUDIES: WHERE AND WHY THE MODEL FAILED TO MAKE CORRECT PREDICTIONS

World War I (12 cases):

Yvonne-Odette (Night)—On the first prediction, the model selected the defender as the winner, with the attacker making no advance. The force ratio was 0.5 to 1. The historical results also show the attacker making no advance, but rate the attacker’s mission accomplishment score as 6 while the defender is rated 4. Therefore, this battle was scored as a draw.

On the second run, the Germans (Sturmgruppe Grethe) were assigned a CEV of 1.9 relative to the US 9th Infantry Regiment. This produced a draw with no advance.

This appears to be a result that was corrected by assigning the CEV to the side that would be expected to have that advantage. There is also a problem in defining who is the winner.

Hill 142—On the first prediction the defending Germans won, whereas in the real world the attacking Marines won. The Marines are recorded as having a higher CEV in a number of battles, so when this correction is put in, the Marines win with a CEV of 1.5. This appears to be a case where the side that would be expected to have the higher CEV needed that CEV input into the combat run to replicate historical results.

Note that while many people would expect the Germans to have the higher CEV, at this juncture in WWI the German regular army was becoming demoralized, while the US Army was highly motivated, trained and fresh. While I did not initially expect to see a superior CEV for the US Marines, when I did see it I was not surprised. I also was not surprised to note that the US Army had a lower CEV than the Marine Corps or that the German Sturmgruppe Grethe had a higher CEV than the US side. As shown in the charts below, the US Marines’ CEV is usually higher than the German CEV for the engagements of Belleau Wood, although this result is not very consistent in value. But this higher value does track with Marine Corps legend. I personally do not have sufficient expertise on WWI to confirm or deny the validity of the legend.

West Wood I—On the first prediction the model rated the battle a draw with minimal advance (0.265 km) for the attacker, whereas historically the attackers were stopped cold with a bloody repulse. The second run predicted a very high CEV of 2.3 for the Germans, who stopped the attackers with a bloody repulse. The results are not easily explainable.

Bouresches I (Night)—On the first prediction the model recorded an attacker victory with an advance of 0.5 kilometer. Historically, the battle was a draw with an attacker advance of one kilometer. The attacker’s mission accomplishment score was 5, while the defender’s was 6. Historically, this battle could also have been considered an attacker victory. A second run with an increased German CEV to 1.5 records it as a draw with no advance. This appears to be a problem in defining who is the winner.

West Wood II—On the first run, the model predicted a draw with an advance of 0.3 kilometers. Historically, the attackers won and advanced 1.6 kilometers. A second run with a US CEV of 1.4 produced a clear attacker victory. This appears to be a case where the side that would be expected to have the higher CEV needed that CEV input into the combat run.

North Woods I—On the first prediction, the model records the defender winning, while historically the attacker won. A second run with a US CEV of 1.5 produced a clear attacker victory. This appears to be a case where the side that would be expected to have the higher CEV needed that CEV input into the combat run.

Chaudun—On the first prediction, the model predicted the defender winning when historically, the attacker clearly won. A second run with an outrageously high US CEV of 2.5 produced a clear attacker victory. The results are not easily explainable.

Medeah Farm—On the first prediction, the model recorded the defender as winning when historically the attacker won with high casualties. The battle consists of a small number of German defenders with lots of artillery defending against a large number of US attackers with little artillery. On the second run, even with a US CEV of 1.6, the German defender won. The model was unable to select a CEV that would get a correct final result yet reflect the correct casualties. The model is clearly having a problem with this engagement.

Exermont—On the first prediction, the model recorded the defender as winning when historically, the attacker did, with both the attacker’s and the defender’s mission accomplishment scores being rated at 5. The model did rate the defender’s casualties too high, so when it calculated what the CEV should be, it gave the defender a higher CEV so that it could bring down the defender’s losses relative to the attacker’s. Otherwise, this is a normal battle. The second prediction was no better. The model is clearly having a problem with this engagement due to the low defender casualties.

Mayache Ravine—The model predicted the winner (the attacker) correctly on the first run, with the attacker having an opposed advance of 0.8 kilometer. Historically, the attacker had an opposed rate of advance of 1.3 kilometers. Both sides had a mission accomplishment score of 5. The problem is that the model predicted higher defender casualties than the attacker, while in the actual battle the defender had lower casualties than the attacker. On the second run, therefore, the model put in a German CEV of 1.5, which resulted in a draw with the attacker advancing 0.3 kilometers. This brought the casualty estimates more in line, but turned a successful win/loss prediction into one that was “off by one.” The model is clearly having a problem with this engagement due to the low defender casualties.

La Neuville—The model also predicted the winner (the attacker) correctly here, with the attacker advancing 0.5 kilometer. In the historical battle they advanced 1.6 kilometers. But again, the model predicted lower attacker losses than the defender losses, while in the actual battle the defender losses were much lower than the attacker losses. So, again on the second run, the model gave the defender (the Germans) a CEV of 1.4, which turned an accurate win/loss prediction into an inaccurate one. It still didn’t do a very good job on the casualties. The model is clearly having a problem with this engagement due to the low defender casualties.

Hill 252—On the first run, the model predicts a draw with a distance advanced of 0.2 km, while the real battle was an attacker victory with an advance of 2.9 kilometers. The model’s casualty predictions are quite good. On the second run, the model correctly predicted an attacker win with a US CEV of 1.5. The distance advanced increases to 0.6 kilometer, while the casualty prediction degrades noticeably. The model is having some problems with this engagement that are not really explainable, but the results are not far off the mark.

Next: WWII Cases

Validating A Combat Model (Part V)

[The article below is reprinted from the April 1997 edition of The International TNDM Newsletter.]

The First Test of the TNDM Battalion-Level Validations: Predicting the Winners
by Christopher A. Lawrence

Part II

CONCLUSIONS:

WWI (12 cases):

For the WWI battles, the nature of the prediction problems is summarized as:

CONCLUSION: In the case of the WWI runs, five of the problem engagements were due to confusion in defining a winner or a clear CEV existing for a side that should have been predictable. Seven out of the 23 runs have some problems, with three problems resolving themselves by assigning a CEV value to a side that may not have deserved it. One (Medeah Farm) was just off any way you look at it, and three suffered problems because historically the defenders (Germans) suffered surprisingly low losses. Two had the battle outcome predicted correctly on the first run, and then had the outcome incorrectly predicted after a CEV was assigned.

With 5 to 7 clear failures (depending on how you count them), this leads one to conclude that the TNDM can be relied upon to predict the winner in a WWI battalion-level battle in about 70% of the cases.

WWII (8 cases):

For the WWII battles, the nature of the prediction problems is summarized as:

CONCLUSION: In the case of the WWII runs, three of the problem engagements were due to confusion in defining a winner or a clear CEV existing for a side that should have been predictable. Four out of the 23 runs suffered a problem because historically the defenders (Germans) suffered surprisingly low losses, and one case simply assigned a possibly unjustifiable CEV. This led to the battle outcome being predicted correctly on the first run, then incorrectly predicted after the CEV was assigned.

With 3 to 5 clear failures, one can conclude that the TNDM can be relied upon to predict the winner in a WWII battalion-level battle in about 80% of the cases.

Modern (8 cases):

For the post-WWII battles, the nature of the prediction problems is summarized as:

CONCLUSION: In the case of the modern runs, only one result was a problem. In the other seven cases, when the force with superior training is given a reasonable CEV (usually around 2), then the correct outcome is achieved. With only one clear failure, one can conclude that the TNDM can be relied upon to predict the winner in a modern battalion-level battle in over 90% of the cases.

FINAL CONCLUSIONS: In this article, the predictive ability of the model was examined only for its ability to predict the winner/loser. We did not look at the accuracy of the casualty predictions or the accuracy of the rates of advance. That will be done in the next two articles. Nonetheless, we could not help but notice some trends.

First and foremost, while the model was expected to be a reasonably good predictor of WWII combat, it did even better for modern combat. It was noticeably weaker for WWI combat. In the case of the WWI data, all attrition figures were multiplied by 4 ahead of time because we knew that there would be a fit problem otherwise.

This would strongly imply that there were more significant changes to warfare between 1918 and 1939 than between 1939 and 1989.

Secondly, the model is a pretty good predictor of winner and loser in WWII and modern cases. Overall, the model predicted the winner in 68% of the cases on the first run and in 84% of the cases in the run incorporating CEV. While its predictive powers were not perfect, there were 13 cases where it just wasn’t getting a good result (17%). Over half of these were from WWI, only one from the modern period.

In some of these battles it was pretty obvious who was going to win. Therefore, the model needed to do a step better than 50% to be even considered. Historically, in 51 out of 76 cases (67%), the larger side in the battle was the winner. One could predict the winner/loser with a reasonable degree of success by just looking at that rule. But the percentage of the time the larger side won varied widely with the period. In WWI the larger side won 74% of the time. In WWII it was 87%. In the modern period it was a counter-intuitive 47% of the time, yet the model was best at selecting the winner in the modern period.

The model’s ability to predict WWI battles is still questionable. It obviously does a pretty good job with WWII battles and appears to be doing an excellent job in the modern period. We suspect that the difference in prediction rates between WWII and the modern period is caused by the selection of battles, not by any inherent ability of the model.

RECOMMENDED CHANGES: While it is too early to settle upon a model improvement program, just looking at the problems of winning and losing, and the ancillary data to that, leads me to three corrections:

  1. Adjust for times of less than 24 hours. Create a formula so that battles of six hours in length are not 1/4 the casualties of a 24-hour battle, but something greater than that (possibly the square root of time). This adjustment should affect both casualties and advance rates.
  2. Adjust advance rates for smaller units to account for the fact that smaller units move faster than larger units.
  3. Adjust for fanaticism to account for those armies that continue to fight after most people would have accepted the result, driving up casualties for both sides.
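
As a rough illustration of the first recommendation, the sketch below compares straight proration of a 24-hour result with the square-root scaling floated above as a possibility. The function names and the 24-hour base are assumptions made for this example; neither function is taken from the TNDM.

```python
import math

BASE_HOURS = 24.0  # assumed reference duration: one full day of combat


def _check(hours: float) -> None:
    # The recommendation only concerns engagements of 24 hours or less.
    if not 0 < hours <= BASE_HOURS:
        raise ValueError("hours must be greater than 0 and at most 24")


def linear_time_factor(hours: float) -> float:
    """Fraction of the 24-hour result under straight proration."""
    _check(hours)
    return hours / BASE_HOURS


def sqrt_time_factor(hours: float) -> float:
    """Fraction of the 24-hour result under square-root-of-time scaling."""
    _check(hours)
    return math.sqrt(hours / BASE_HOURS)


if __name__ == "__main__":
    for h in (6, 12, 24):
        print(h, linear_time_factor(h), sqrt_time_factor(h))
```

For a six-hour battle the linear factor is 0.25 while the square-root factor is 0.5, so the shorter battle would be credited with half, rather than a quarter, of the 24-hour casualties and advance.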

Next: Part III: Case Studies

Validating A Combat Model (Part IV)

[The article below is reprinted from the April 1997 edition of The International TNDM Newsletter.]

The First Test of the TNDM Battalion-Level Validations: Predicting the Winners
by Christopher A. Lawrence

Part I

In the basic concept of the TNDM battalion-level validation, we decided to collect data from battles from three periods: WWI, WWII, and post-WWII. We then made a TNDM run for each battle exactly as the battle was laid out, with both sides having the same CEV [Combat Effectiveness Value]. The results of that run indicated what the CEV should have been for the battle, and we then made a second run using that CEV. That was all we did. We wanted to make sure that there was no “tweaking” of the model for the validation, so we stuck rigidly to this procedure. We then evaluated each run for its fit in three areas:

  1. Predicting the winner/loser
  2. Predicting the casualties
  3. Predicting the advance rate

We did end up changing two engagements around. We had a similar situation with one WWII engagement (Tenaru River) and one modern period engagement (Bir Gifgafa), where the defender received reinforcements part-way through the battle and counterattacked. In both cases we decided to run them as two separate battles (adding two more battles to our database), with the conditions from the first engagement being the starting strength, plus the reinforcements, for the second engagement. Based on our previous experience with running Goose Green, for all the Falkland Islands battles we counted the Milans and Carl Gustavs as infantry weapons. That is the only “tweaking” we did that affected the battle outcome in the model. We also put in a casualty multiplier of 4 for WWI engagements, but that is discussed in the article on casualties.

This is the analysis of the first test, predicting the winner/loser. Basically, if the attacker won historically, we assigned it a value of 1, a draw was 0, and a defender win was -1. The TNDM results summary has a column called “winner” which records either an attacker win, a draw, or a defender win. We compared these two results. If they were the same, this is a “correct” result. If they are “off by one,” this means the model predicted an attacker win or loss where the actual result was a draw, or the model predicted a draw where the actual result was a win or loss. If they are “off by two,” then the model simply missed and predicted the wrong winner.
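
The scoring rule just described can be sketched in a few lines. This is only an illustration of the 1 / 0 / -1 comparison; the names `OUTCOME_SCORE`, `classify`, and `tally` are invented for the example and are not part of the TNDM or its output.

```python
from collections import Counter

# Values assigned in the text: attacker win = 1, draw = 0, defender win = -1.
OUTCOME_SCORE = {"attacker": 1, "draw": 0, "defender": -1}

# Label for each possible absolute difference between the two scores.
LABELS = {0: "correct", 1: "off by one", 2: "off by two"}


def classify(predicted: str, historical: str) -> str:
    """Compare the model's winner with the historical winner."""
    diff = abs(OUTCOME_SCORE[predicted] - OUTCOME_SCORE[historical])
    return LABELS[diff]


def tally(results):
    """Count the labels over an iterable of (predicted, historical) pairs."""
    return Counter(classify(p, h) for p, h in results)


if __name__ == "__main__":
    # First-run outcomes as described in the case studies.
    print(classify("defender", "attacker"))  # Hill 142
    print(classify("defender", "draw"))      # Yvonne-Odette
```

Under this rule the first run for Hill 142 (model: defender win; history: attacker win) is “off by two,” while the first run for Yvonne-Odette (model: defender win; history scored as a draw) is “off by one.”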

The results are (the envelope please….):

It is hard to determine a good predictability from a bad one. Obviously, the initial WWI prediction of 57% right is not very good, while the Modern second run result of 97% is quite good. What I would really like to do is compare these outputs to some other model (like TACWAR) to see if they get a closer fit. I have reason to believe that they will not do better.

Most cases in which the model was “off by 1” were easily correctable by accounting for the different personnel capabilities of the army. Therefore, to see where the model really failed, let’s look just at where it simply got the wrong winner:

The TNDM is not designed or tested for WWI battles. It is basically designed to predict combat between 1939 and the present. The total percentages without the WWI data in it are:

Overall, based upon this data I would be willing to claim that the model can predict the correct winner 75% of the time without accounting for human factors and 90% of the time if it does.

CEVs: Quite simply a user of the TNDM must develop a CEV to get a good prediction. In this particular case, the CEVs were developed from the first run. This means that in the second run, the numbers have been juggled (by changing the CEV) to get a better result. This would make this effort meaningless if the CEVs were not fairly consistent over several engagements for one side versus its other side. Therefore, they are listed below in broad groupings so that the reader can determine if the CEVs appear to be basically valid or are simply being used as a “tweak.”

Now, let’s look where it went wrong. The following battles were not predicted correctly:

There are 19 night engagements in the database, five from WWI, three from WWII, and 11 modern. We looked at whether the mispredictions were clustered among night engagements and that did not seem to be the case. Unable to find a pattern, we examined each engagement to see what the problem was. See the attachments at the end of this article for details.

We did obtain CEVs that showed some consistency. These are shown below. The Marines in World War I record the following CEVs in these WWI battles:

Compare those figures to the performance of the US Army:

In the above two and in all following cases, the italicized battles are the ones with which we had prediction problems.

For comparison purposes, the following CEVs were recorded in the battles in World War II between the US and Japan:

For comparison purposes, the following CEVs were recorded in Operation Veritable:

These are the other engagements versus Germans for which CEVs were recorded:

For comparison purposes, the following CEVs were recorded in the post-WWII battles between Vietnamese forces and their opponents:

Note that the Americans have an average CEV advantage of 1.6 over the NVA (only three cases) while having a 1.8 advantage over the VC (6 cases).

For comparison purposes, the following CEVs were recorded in the battles between the British and Argentines:

Next: Part II: Conclusions

Validating A Combat Model (Part III)

[The article below is reprinted from the April 1997 edition of The International TNDM Newsletter.]

Numerical Adjustment of CEV Results: Averages and Means
by Christopher A. Lawrence and David L. Bongard

As part of the battalion-level validation effort, we made two runs with the model for each test case—one without CEV [Combat Effectiveness Value] incorporated and one with the CEV incorporated. The printout of a TNDM [Tactical Numerical Deterministic Model] run has three CEV figures for each side: CEVt, CEVl, and CEVad. CEVt shows the CEV as calculated on the basis of battlefield results as a ratio of the performance of side a versus side b. It measures performance based upon three factors: mission accomplishment, advance, and casualty effectiveness. CEVt is calculated according to the following formula:

P′ = Refined Combat Power Ratio (sum of the modified OLIs). The ′ in P′ indicates that this ratio has been “refined” (modified) by two behavioral values already: the factor for Surprise and the Set Piece Factor.

CEVd = 1/CEVa (the reciprocal)

In effect the formula is relative results multiplied by the modified combat power ratio. This is basically the formulation that was used for the QJM [Quantified Judgement Model].

In the TNDM Manual, there is an alternate CEV method based upon comparative effective lethality. This methodology has the advantage that the user doesn’t have to evaluate mission accomplishment on a ten-point scale. The CEVl is calculated according to the following formula:

In effect, CEVt is a measurement of the difference in results predicted by the model from actual historical results based upon assessment for three different factors (mission success, advance rates, and casualties), while CEVl is a measurement of the difference in predicted casualties from actual casualties. The CEVt and the CEVl of the defender is the reciprocal of the one for the attacker.

Now the problem comes in when one creates the CEVad, which is the average of the two CEVs above. I simply do not know why it was decided to create an alternate CEV calculation from the old QJM method, and then average the two, but this is what is currently being done in the model. This averaging results in a revised CEV for the attacker and for the defender that are not reciprocals of each other, unless the CEVt and the CEVl were the same. We even have some cases where both sides had a CEVad of greater than one. Also, by averaging the two, we have heavily weighted casualty effectiveness relative to mission effectiveness and mission accomplishment.
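
A small numerical sketch makes the reciprocity problem concrete. It assumes that CEVad is the plain arithmetic mean of CEVt and CEVl for each side and that the defender's CEVt and CEVl are the reciprocals of the attacker's, as described above. The function name `cevad_pair` and all input values are hypothetical.

```python
def cevad_pair(cevt_attacker: float, cevl_attacker: float) -> tuple:
    """Return (attacker CEVad, defender CEVad).

    Assumption: each side's CEVad is the arithmetic mean of its own CEVt
    and CEVl, and the defender's CEVt and CEVl are the reciprocals of the
    attacker's values.
    """
    if cevt_attacker <= 0 or cevl_attacker <= 0:
        raise ValueError("CEVs must be positive")
    attacker = (cevt_attacker + cevl_attacker) / 2.0
    defender = (1.0 / cevt_attacker + 1.0 / cevl_attacker) / 2.0
    return attacker, defender


if __name__ == "__main__":
    # Hypothetical inputs: identical, moderately different, and opposed.
    for cevt, cevl in ((2.0, 2.0), (2.0, 1.0), (4.0, 0.5)):
        a, d = cevad_pair(cevt, cevl)
        print(cevt, cevl, a, d, a * d)
```

With identical inputs (2.0 and 2.0) the two results are 2.0 and 0.5, which are reciprocals. With 2.0 and 1.0 they are 1.5 and 0.75, whose product is 1.125 rather than 1. With 4.0 and 0.5 they are 2.25 and 1.125, so both sides end up with a CEVad greater than one.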

What was done in these cases (again based more on TDI tradition or habit, and not on any specific rule) was:

(1.) If CEVad are reciprocals, then use as is.

(2.) If one CEV is greater than one while the other is less than one, then add the higher CEV to the value of the reciprocal of the lower CEV (1/x) and divide by two. This result is the CEV for the superior force, and its reciprocal is the CEV for the inferior force.

(3.) If both CEVs are above one, then we divide the larger CEVad value by the smaller, and use its result as the superior force’s CEV.

In the case of (3.) above, this methodology usually results in a slightly higher CEV for the attacker side than if we used the average of the reciprocal (usually 0.1 or 0.2 higher). While the mathematical and logical consistency of the procedure bothered me, the logic for the different procedure in (3.) was that the model was clearly having a problem with predicting the engagement to start with, but that in most cases when this happened before (meaning before the validation), a higher CEV usually produced a better fit than a lower one. As this is what was done before, I accepted it as is, especially if one looks at the example of Medeah Farm. If one averages the reciprocal with the US’s CEV of 8.065, one would get a CEV of 4.13. By the methodology in (3.), one comes up with a more reasonable US CEV of 1.58.
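
The three rules can be sketched as a single function. This is one reading of the procedure, not TDI's code: the function name `reconcile_cevad`, the tolerance used to decide when two values count as reciprocals, and the decision to raise an error for cases the rules do not mention (for example, both values below one) are all assumptions. Rule (3.) is read as applying when both CEVad values exceed one.

```python
import math


def reconcile_cevad(cev_attacker: float, cev_defender: float, tol: float = 1e-3):
    """Return (superior_side, superior_cev, inferior_cev) per rules (1.)-(3.)."""
    if cev_attacker <= 0 or cev_defender <= 0:
        raise ValueError("CEVs must be positive")

    # Identify the side with the larger CEVad.
    if cev_attacker >= cev_defender:
        side, high, low = "attacker", cev_attacker, cev_defender
    else:
        side, high, low = "defender", cev_defender, cev_attacker

    if math.isclose(high * low, 1.0, abs_tol=tol):
        superior = high                       # rule (1.): already reciprocals
    elif high > 1.0 and low < 1.0:
        superior = (high + 1.0 / low) / 2.0   # rule (2.): average with reciprocal
    elif low > 1.0:
        superior = high / low                 # rule (3.): both above one
    else:
        raise ValueError("combination not covered by the three rules")

    # The inferior force always receives the reciprocal of the superior CEV.
    return side, superior, 1.0 / superior


if __name__ == "__main__":
    # Hypothetical CEVad pairs, one for each rule.
    print(reconcile_cevad(2.0, 0.5))
    print(reconcile_cevad(1.6, 0.5))
    print(reconcile_cevad(1.5, 2.4))
```

As a plausibility check on the Medeah Farm figures, a US CEVad of 8.065 reduced to 1.58 by rule (3.) implies a German CEVad of roughly 5.1. That value is back-solved here and is not given in the text; feeding 8.065 and 5.1 through rule (2.) instead gives about 4.13, which matches the alternative quoted above.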

The interesting aspect is that the TNDM rules manual explains how CEVt, CEVl and CEVad are calculated, but never is it explained which CEVad (attacker or defender) should be used. This is the first explanation of this process, and was based upon the “traditions” used at TDI. There is a strong argument to merge the two CEVs into one formulation. I am open to another methodology for calculating CEV. I am not satisfied with how CEV is calculated in the TNDM and intend to look into this further. Expect another article on this subject in the next issue.

Validating A Combat Model (Part II)

[The article below is reprinted from the October 1996 edition of The International TNDM Newsletter.]

Validation of the TNDM at Battalion Level
by Christopher A. Lawrence

The original QJM (Quantified Judgement Model) was created and validated using primarily division-level engagements from WWII and the 1967 and 1973 Mid-East Wars. For a number of reasons, we are now using the TNDM (Tactical Numerical Deterministic Model) for analyzing lower-level engagements. We expect, with the changed environment in the world, this trend to continue.

The model, while designed to handle battalion-level engagements, was never validated for those size engagements. There were only 16 engagements in the original QJM Database with less than 5,000 people on one side, and only one with less than 2,000 people on a side. The sixteen smallest engagements are:

While it is not unusual in the operations research community to use unvalidated models of combat, it is a very poor practice. As TDI is starting to use this model for battalion-level engagements, it is time it was formally validated for that use. A model that is validated at one level of combat is not validated to represent sizes, types and forms of combat to which it has not been tested. TDI is undertaking a battalion-level validation effort for the TNDM. We intend to publish the material used and the results of the validation in the International TNDM Newsletter. As part of this battalion-level validation we will also be looking at a number of company-level engagements. Right now, my intention is to simply just throw all the engagements into the same hopper and see what comes out.

By battalion-level, I mean any operation consisting of the equivalent of two or fewer reinforced battalions on one side. Three or more battalions imply a regiment- or brigade-level operation. A battalion in combat can range widely in strength, but usually does not have an authorized strength in excess of 900. Therefore, the upper limit for a battalion-level engagement is 2,000 people, while its lower limit can easily go below 500 people. Only one engagement in the original QJM Database fits that definition of a battalion-level engagement. HERO, DMSI, TND & Associates, and TDI (all companies founded by Trevor N. Dupuy) examined a number of small engagements over the years. HERO assembled 23 WWI engagements for the Land Warfare Database (LWDB); TDI has done 15 WWII small unit actions for the Suppression contract, and Dave Bongard has assembled four others from that period for the Pacific; DMSI did 14 battalion-level engagements from Vietnam for a study on low intensity conflict 10 years ago; Dave Bongard has been independently looking into the Falkland Islands War and other post-WWII sources to locate 10 more engagements; and we have three engagements that Trevor N. Dupuy did for South Africa. We added two other World War II engagements and the three smallest engagements from the list to the left (those marked with an asterisk). This gives us a list of 74 additional engagements that can be used to test the TNDM.

The smallest of these engagements is 220 people on both sides (100 vs 120), while the largest engagement on this list is 5,336 versus 3,270 or 8,679 vs 725. These 74 engagements consist of 23 engagements from WWI, 22 from WWII, and 29 post-1945 engagements. There are three engagements where both sides have over 3,000 men and three more where both sides are above 2,000 men. In the other 68 engagements, at least one side is below 2,000, while in 50 of the engagements, both sides are below 2,000.

This leaves the following force sizes to be tested:

These engagements have been “randomly” selected in the sense that the researchers grabbed whatever had been done and whatever else was conveniently available. It is not a proper random selection, in the sense that every war in this century was analyzed and a representative number of engagements was taken from each conflict. This is not practical, so we settle for less than perfect data selection.

Furthermore, as many of these conflicts are with countries that do not have open archives (and in many cases limited unit records) some of the opposing forces strength and losses had to be estimated. This is especially true with the Viet Nam engagements. It is hoped that the errors in estimation deviate equally on both sides of the norm, but there is no way of knowing that until countries like the People’s Republic of China and Vietnam open up their archives for free independent research.

TDI intends to continue to look for battalion-level and smaller engagements for analysis, and may add to this database over time. If some of our readers have any other data assembled, we would be interested in seeing it. In the next issue we will publish the preliminary results of our validation.

Note that in the above table, for World War II, German, Japanese, and Axis forces are listed in italics, while US, British, and Allied forces are listed in regular typeface. Also, in the VERITABLE engagements, the 5/7th Gordons’ action continues the assault of the 7th Black Watch, and the 9th Cameronians assumed the attack begun by the 2d Gordon Highlanders.

Tu-Vu is described in some detail in Fall’s Street Without Joy (pp. 51-53). The remaining Indochina/SE Asia engagements listed here are drawn from a QJM-based analysis of low-intensity operations (HERO Report 124, Feb 1988).

The coding for source and validation status, on the extreme right of each engagement line in the D Cas column, is as follows:

  • n indicates an engagement which has not been employed for validation, but for which good data exists for both sides (35 total).
  • Q indicates an engagement which was part of the original QJM database (3 total).
  • Q+ indicates an engagement which was analyzed as part of the QJM low-intensity combat study in 1988 (14 total).
  • T indicates an engagement analyzed with the TNDM (20 total).

The Origins Of The U.S. Army’s Concept Of Combat Power

The U.S. Army’s concept of combat power can be traced back to the thinking of British theorist J.F.C. Fuller, who collected his lectures and thoughts into the book, The Foundations of the Science of War (1926).

In a previous post, I critiqued the existing U.S. Army doctrinal method for calculating combat power. The ideas associated with the term “combat power” have been a part of U.S. Army doctrine since the 1920s. However, the Army did not specifically define what combat power actually meant until the 1982 edition of FM 100-5 Operations, which introduced the AirLand Battle concept. So where did the Army’s notion of the concept originate? This post will trace the way it has been addressed in the capstone Field Manual (FM) 100-5 Operations series.

As then-U.S. Army Major David Boslego explained in a 1995 School of Advanced Military Studies (SAMS) thesis[1], the Army’s original idea of combat power most likely derived from the work of British military theorist J.F.C. Fuller. In the late 1910s and early 1920s, Fuller articulated the first modern definitions of the principles of war, which he developed from his conception of force on the battlefield as something more than just the tangible effects of shock and firepower. Fuller’s principles were adopted in the 1920 edition of the British Army Field Service Regulations (FSR), which was the likely vector of influence on the U.S. Army’s 1923 FSR. While the term “combat power” does not appear in the 1923 FSR, the influence of Fullerian thinking is evident.

The first use of the phrase itself by the Army can be found in the 1939 edition of FM 100-5 Tentative Field Service Regulations, Operations, which replaced and updated the 1923 FSR. It appears just twice and was not explicitly defined in the text. As Boslego noted, however, even then the use of the term

highlighted a holistic view of combat power. This power was the sum of all factors which ultimately affected the ability of the soldiers to accomplish the mission. Interestingly, the authors of the 1939 edition did not focus solely on the physical objective of destroying the enemy. Instead, they sought to break the enemy’s power of resistance which connotes moral as well as physical factors.

This basic, implied definition of combat power as a combination of interconnected tangible physical and intangible moral factors could be found in all successive editions of FM 100-5 through 1968. The type and character of the factors comprising combat power evolved along with the Army’s experience of combat through this period, however. In addition to leadership, mobility, and firepower, the 1941 edition of FM 100-5 included “better armaments and equipment,” which reflected the Army’s initial impressions of the early “blitzkrieg” battles of World War II.

From World War II Through Korea

While FM 100-5 (1944) and FM 100-5 (1949) made no real changes with respect to describing combat power, the 1954 edition introduced significant new ideas in the wake of major combat operations in Korea, albeit still without actually defining the term. As with its predecessors, FM 100-5 (1954) posited combat power as a combination of firepower, maneuver, and leadership. For the first time, it defined the principles of mass, unity of command, maneuver, and surprise in terms of combat power. It linked the principle of the offensive, “only offensive action achieves decisive results,” with the enduring dictum that “offensive action requires the concentration of superior combat power at the decisive point and time.”

Boslego credited the authors of FM 100-5 (1954) with recognizing the non-linear nature of warfare and advising commanders to take a holistic perspective. He observed that they introduced the subtle but important understanding of combat power not as a fixed value, but as something relative and interactive between two forces in battle. Any calculation of combat power would be valid only in relation to the opposing combat force. “Relative combat power is dynamic and can be directly influenced by opposing commanders. It therefore must be analyzed by the commander in its potential relation to all other factors.” One of the fundamental ways a commander could shift the balance of combat power against an enemy was through maneuver: “Maneuver must be used to alter the relative combat power of military forces.”

[As I mentioned in a previous post, Trevor Dupuy considered FM 100-5 (1954)’s list and definitions of the principles of war to be the best version.]

Into the “Pentomic Era”

The 1962 edition of FM 100-5 supplied a general definition of combat power that articulated the way the Army had been thinking about it since 1939.

Combat power is a combination of the physical means available to a commander and the moral strength of his command. It is significant only in relation to the combat power of the opposing forces. In applying the principles of war, the development and application of combat power are essential to decisive results.

It further refined the elements of combat power by redefining the principles of economy of force and security in terms of it as well.

By the early 1960s, however, the Army’s thinking about force on the battlefield was dominated by the prospect of the use of nuclear weapons. As Boslego noted, both FM 100-5 (1962) and FM 100-5 (1968)

dwelt heavily on the importance of dispersing forces to prevent major losses from a single nuclear strike, being highly mobile to mass at decisive points and being flexible in adjusting forces to the current situation. The terms dispersion, flexibility, and mobility were repeated so frequently in speeches, articles, and congressional testimony, that…they became a mantra. As a result, there was a lack of rigor in the Army concerning what they meant in general and how they would be applied on the tactical battlefield in particular.

The only change the 1968 edition made was to expand the elements of combat power to include “firepower, mobility, communications, condition of equipment, and status of supply,” which presaged an increasing focus on the technological aspects of combat and warfare.

The first major modification in the way the Army thought about combat power since before World War II was reflected in FM 100-5 (1976). These changes in turn prompted a significant reevaluation of the concept by then-U.S. Army Major Huba Wass de Czege. I will tackle how this resulted in the way combat power was redefined in the 1982 edition of FM 100-5 in a future post.

Notes

[1] David V. Boslego, “The Relationship of Information to the Relative Combat Power Model in Force XXI Engagements,” School of Advanced Military Studies Monograph, U.S. Army Command and General Staff College, Fort Leavenworth, Kansas, 1995.

Spotted In The New Books Section Of The U.S. Naval Academy Library…

Christopher A. Lawrence, War by Numbers: Understanding Conventional Combat (Lincoln, NE: Potomac Books, 2017) 390 pages, $39.95

War by Numbers assesses the nature of conventional warfare through the analysis of historical combat. Christopher A. Lawrence (President and Executive Director of The Dupuy Institute) establishes what we know about conventional combat and why we know it. By demonstrating the impact a variety of factors have on combat he moves such analysis beyond the work of Carl von Clausewitz and into modern data and interpretation.

Using vast data sets, Lawrence examines force ratios, the human factor in case studies from World War II and beyond, the combat value of superior situational awareness, and the effects of dispersion, among other elements. Lawrence challenges existing interpretations of conventional warfare and shows how such combat should be conducted in the future, simultaneously broadening our understanding of what it means to fight wars by the numbers.

The book is available in paperback directly from Potomac Books and in paperback and Kindle from Amazon.

How Does the U.S. Army Calculate Combat Power? ¯\_(ツ)_/¯

The constituents of combat power as described in current U.S. military doctrine. [The Lightning Press]

One of the fundamental concepts of U.S. warfighting doctrine is combat power. The current U.S. Army definition is “the total means of destructive, constructive, and information capabilities that a military unit or formation can apply at a given time” (ADRP 3-0). It is the construct commanders and staffs are taught to use to assess the relative effectiveness of combat forces and is woven deeply throughout all aspects of U.S. operational thinking.

To execute operations, commanders conceptualize capabilities in terms of combat power. Combat power has eight elements: leadership, information, mission command, movement and maneuver, intelligence, fires, sustainment, and protection. The Army collectively describes the last six elements as the warfighting functions. Commanders apply combat power through the warfighting functions using leadership and information. [ADP 3-0, Operations]

Yet, there is no formal method in U.S. doctrine for estimating combat power. The existing process is intentionally subjective and largely left up to judgment. This is problematic, given that assessing the relative combat power of friendly and opposing forces on the battlefield is the first step in Course of Action (COA) development, which is at the heart of the U.S. Military Decision-Making Process (MDMP). Estimates of combat power also figure heavily in determining the outcomes of wargames evaluating proposed COAs.

The Existing Process

The Army’s current approach to combat power estimation is outlined in Field Manual (FM) 6-0 Commander and Staff Organization and Operations (2014). Planners are instructed to “make a rough estimate of force ratios of maneuver units two levels below their echelon.” They are then directed to “compare friendly strengths against enemy weaknesses, and vice versa, for each element of combat power.” It is “by analyzing force ratios and determining and comparing each force’s strengths and weaknesses as a function of combat power” that planners gain insight into tactical and operational capabilities, perspectives, vulnerabilities, and required resources.

That is it. Planners are told that “although the process uses some numerical relationships, the estimate is largely subjective. Assessing combat power requires assessing both tangible and intangible factors, such as morale and levels of training.” There is no guidance as to how to determine force ratios [numbers of troops or weapons systems?]. Nor is there any description of how to relate force calculations to combat power. Should force strengths be used somehow to determine a combat power value? Who knows? No additional doctrinal or planning references are provided.
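The ambiguity about what to count can be made concrete with a short sketch. It is not a doctrinal method: the function name and every strength figure below are hypothetical, chosen only to show that the same two forces yield different "force ratios" depending on whether troops or weapons systems are counted.

```python
def force_ratio(friendly: float, enemy: float) -> float:
    """Return the simple friendly-to-enemy ratio for one counted quantity."""
    if enemy <= 0:
        raise ValueError("enemy strength must be positive")
    return friendly / enemy


# Hypothetical strengths for two opposing forces (illustrative only).
friendly = {"troops": 3000, "tanks": 40}
enemy = {"troops": 1500, "tanks": 44}

# One ratio per counted quantity; nothing in this sketch says which one
# (if either) should stand in for "combat power".
ratios = {item: force_ratio(friendly[item], enemy[item]) for item in friendly}

for item, ratio in ratios.items():
    print(f"{item}: {ratio:.2f} to 1")
```

With these invented figures the friendly side is ahead two to one in troops but behind in tanks, so the "rough estimate of force ratios" depends entirely on a choice the manual leaves open.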

Planners then use these subjective combat power assessments as they shape potential COAs and test them through wargaming. Although explicitly warned not to “develop and recommend COAs based solely on mathematical analysis of force ratios,” they are invited at this stage to consult a table of “minimum historical planning ratios as a starting point.” The table is clearly derived from the ubiquitous 3-1 rule of combat. Contrary to what FM 6-0 claims, neither the 3-1 rule nor the table has a clear historical provenance or any sort of empirical substantiation. There is no proven validity to any of the values cited. It is not even clear whether the “historical planning ratios” apply to manpower, firepower, or combat power.

During this phase, planners are advised to account for “factors that are difficult to gauge, such as impact of past engagements, quality of leaders, morale, maintenance of equipment, and time in position. Levels of electronic warfare support, fire support, close air support, civilian support, and many other factors also affect arraying forces.” FM 6-0 offers no detail as to how these factors should be measured or applied, however.

FM 6-0 also addresses combat power assessment for stability and civil support operations through troop-to-task analysis. Force requirements are to be based on an estimate of troop density, a “ratio of security forces (including host-nation military and police forces as well as foreign counterinsurgents) to inhabitants.” The manual advises that “most density recommendations fall within a range of 20 to 25 counterinsurgents for every 1,000 residents in an area of operations. A ratio of twenty counterinsurgents per 1,000 residents is often considered the minimum troop density required for effective counterinsurgency operations.”
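The arithmetic implied by the troop density construct is simple enough to sketch. The 20 and 25 per 1,000 figures come from the passage quoted above; the function name and the population figure are hypothetical and used only for illustration.

```python
import math


def required_counterinsurgents(residents: int, per_thousand: float = 20.0) -> int:
    """Security forces implied by a fixed troop density, rounded up."""
    if residents < 0 or per_thousand < 0:
        raise ValueError("inputs must be non-negative")
    return math.ceil(residents * per_thousand / 1000)


# Hypothetical area of operations with 250,000 residents.
population = 250_000

low = required_counterinsurgents(population, 20)   # quoted minimum density
high = required_counterinsurgents(population, 25)  # top of the quoted range

print(f"{low:,} to {high:,} counterinsurgents")
```

For this invented population the quoted range works out to between 5,000 and 6,250 security personnel, counting host-nation military and police as well as foreign forces. The calculation is a fixed ratio and takes no account of the situation.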

While FM 6-0 acknowledges that “as with any fixed ratio, such calculations strongly depend on the situation,” it does not mention that any references to force level requirements, tie-down ratios, or troop density were stripped from both Joint and Army counterinsurgency manuals in 2013 and 2014. Yet, this construct lingers on in official staff planning doctrine. (Recent research challenged the validity of the troop density construct but the Defense Department has yet to fund any follow-on work on the subject.)

The Army Has Known About The Problem For A Long Time

The Army has tried several solutions to the problem of combat power estimation over the years. In the early 1970s, the U.S. Army Center for Army Analysis (CAA; known then as the U.S. Army Concepts & Analysis Agency) developed the Weighted Equipment Indices/Weighted Unit Value (WEI/WUV or “wee‑wuv”) methodology for calculating the relative firepower of different combat units. While WEI/WUVs were soon adopted throughout the Defense Department, the subjective nature of the method gradually led to its abandonment for official use.
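A weighted firepower score of this general kind can be sketched as follows. This is an assumption about the shape the name suggests (an index for each item of equipment, a weight for each equipment category, summed over a unit's holdings), not a reproduction of the CAA methodology. Every index, weight, and inventory below is an invented placeholder.

```python
# Hypothetical equipment indices: relative worth of an item within its category.
EQUIPMENT_INDEX = {"tank_a": 1.0, "tank_b": 0.8, "rifle": 1.0}

# Hypothetical category weights: relative worth of one category against another.
CATEGORY_WEIGHT = {"tank": 50.0, "small_arms": 1.0}

# Which category each item belongs to.
CATEGORY_OF = {"tank_a": "tank", "tank_b": "tank", "rifle": "small_arms"}


def weighted_unit_value(inventory: dict[str, int]) -> float:
    """Sum of count x equipment index x category weight over a unit's holdings."""
    total = 0.0
    for item, count in inventory.items():
        total += count * EQUIPMENT_INDEX[item] * CATEGORY_WEIGHT[CATEGORY_OF[item]]
    return total


# Two hypothetical units with identical headcounts of equipment.
unit_x = {"tank_a": 10, "rifle": 200}
unit_y = {"tank_b": 10, "rifle": 200}

score_x = weighted_unit_value(unit_x)
score_y = weighted_unit_value(unit_y)
print(f"relative firepower, X to Y: {score_x / score_y:.2f} to 1")
```

The sketch also shows where subjectivity enters: the output is driven entirely by the index and weight tables, which someone has to choose.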

In the 1980s and 1990s, the U.S. Army Command & General Staff College (CGSC) published the ST 100-9 and ST 100-3 student workbooks that contained tables of planning factors that became the informal basis for calculating combat power in staff practice. The STs were revised regularly and then adapted into spreadsheet format in the late 1990s. The 1999 iteration employed WEI/WUVs as the basis for calculating firepower scores used to estimate force ratios. CGSC stopped updating the STs in the early 2000s, as the Army focused on irregular warfare.

With the recently renewed focus on conventional conflict, Army staff planners are starting to realize that their planning factors are out of date. In an attempt to fill this gap, CGSC developed a new spreadsheet tool in 2012 called the Correlation of Forces (COF) calculator. It apparently drew upon analysis done by the U.S. Army Training and Doctrine Command Analysis Center (TRAC) in 2004 to establish new combat unit firepower scores. (TRAC’s methodology is not clear, but if it is based on this 2007 ISMOR presentation, the scores are derived from runs by an unspecified combat model modified by factors derived from the Army’s unit readiness methodology. If described accurately, this would not be an improvement over WEI/WUVs.)

The COF calculator continues to use the 3-1 force ratio tables. It also incorporates a table for estimating combat losses based on force ratios (this despite ample empirical historical analysis showing that there is no correlation between force ratios and casualty rates).

While the COF calculator is not yet an official doctrinal product, CGSC plans to add Marine Corps forces to it for use as a joint planning tool and to incorporate it into the Army’s Command Post of the Future (CPOF). TRAC is developing a stand-alone version for use by force developers.

The incorporation of unsubstantiated and unvalidated concepts into Army doctrine has been a long-standing problem. In 1976, Huba Wass de Czege, then an Army major, took both “loosely structured and unscientific analysis” based on intuition and experience and simple counts of gross numbers to task as insufficient “for a clear and rigorous understanding of combat power in a modern context.” He proposed replacing it with an analytical framework for analyzing combat power that accounted for both measurable and intangible factors. Adopting a scrupulous method and language would overcome the simplistic tactical analysis then being taught. While some of the essence of Wass de Czege’s approach has found its way into doctrinal thinking, his criticism of the lack of objective and thorough analysis continues to echo (here, here, and here, for example).

Despite dissatisfaction with the existing methods, little has changed. The problem with this should be self-evident, but I will give the U.S. Naval War College the final word here:

Fundamentally, all of our approaches to force-on-force analysis are underpinned by theories of combat that include both how combat works and what matters most in determining the outcomes of engagements, battles, campaigns, and wars. The various analytical methods we use can shed light on the performance of the force alternatives only to the extent our theories of combat are valid. If our theories are flawed, our analytical results are likely to be equally wrong.