Tag Lanchester equations

TDI Friday Read: Engaging The Phalanx

The December 2018 issue of Phalanx, a periodical journal published by The Military Operations Research Society (MORS), contains an article by Jonathan K. Alt, Christopher Morey, and Larry Larimer, entitled “Perspectives on Combat Modeling.” (the article is paywalled, but limited public access is available via JSTOR).

Their article was written partly as a critical rebuttal to a TDI blog post originally published in April 2017, which discussed an issue of which the combat modeling and simulation community has long been aware but slow to address, known as the “Base of Sand” problem.

Wargaming Multi-Domain Battle: The Base Of Sand Problem

In short, because so little is empirically known about the real-world structures of combat processes and the interactions of these processes, modelers have been forced to rely on the judgement of subject matter experts (SMEs) to fill in the blanks. No one really knows if the blend of empirical data and SME judgement accurately represents combat because the modeling community has been reluctant to test its models against data on real world experience, a process known as validation.

TDI President Chris Lawrence subsequently published a series of blog posts responding to the specific comments and criticisms leveled by Alt, Morey, and Larimer.

How are combat models and simulations tested to see if they portray real-world combat accurately? Are they actually tested?

Engaging the Phalanx

How can we know if combat simulations adhere to strict standards established by the DoD regarding validation? Perhaps the validation reports can be released for peer review.

Validation

Some claim that models of complex combat behavior cannot really be tested against real-world operational experience, but this has already been done. Several times.

Validating Attrition

If only the “physics-based aspects” of combat models are empirically tested, do those models reliably represent real-world combat with humans or only the interactions of weapons systems?

Physics-based Aspects of Combat

Is real-world historical operational combat experience useful only for demonstrating the capabilities of combat models, or is it something the models should be able to reliably replicate?

Historical Demonstrations?

If a Subject Matter Expert (SME) can be substituted for a proper combat model validation effort, then could not a SME simply be substituted for the model? Should not all models be considered expert judgement quantified?

SMEs

What should be done about the “Base of Sand” problem? Here are some suggestions.

Engaging the Phalanx (part 7 of 7)

Persuading the military operations research community of the importance of research on real-world combat experience in modeling has been an uphill battle with a long history.

Diddlysquat

And the debate continues…

Validation

Continuing to comment on the article in the December 2018 issue of the Phalanx by Jonathan Alt, Christopher Morey and Larry Larimer (this is part 2 of 7; see part 1 here).

On the first page (page 28) top of the third column they make the rather declarative statement that:

The combat simulations used by military operations research and analysis agencies adhere to strict standards established by the DoD regarding verification, validation and accreditation (Department of Defense, 2009).

Now, I have not reviewed what has been done on verification, validation and accreditation since 2009, but I did do a few fairly exhaustive reviews before then. One such review is written up in depth in The International TNDM Newsletter. It is Volume 1, No. 4 (February 1997). You can find it here:

http://www.dupuyinstitute.org/tdipub4.htm

The newsletter includes a letter dated 21 January 1997 from the Scientific Advisor to the CG (Commanding General)  at TRADOC (Training and Doctrine Command). This is the same organization that the three gentlemen who wrote the article in the Phalanx work for. The Scientific Advisor sent a letter out to multiple commands to try to flag the issue of validation (letter is on page 6 of the newsletter). My understanding is that he received few responses (I saw only one, it was from Leavenworth). After that, I gather there was no further action taken. This was a while back, so maybe everything has changed, as I gather they are claiming with that declarative statement. I doubt it.

This issue to me is validation. Verification is often done. Actual validations are a lot rarer. In 1997, this was my list of combat models in the industry that had been validated (the list is on page 7 of the newsletter):

1. Atlas (using 1940 Campaign in the West)

2. Vector (using undocumented turning runs)

3. QJM (by HERO using WWII and Middle-East data)

4. CEM (by CAA using Ardennes Data Base)

5. SIMNET/JANUS (by IDA using 73 Easting data)

 

Now, in 2005 we did a report on Casualty Estimation Methodologies (it is report CE-1 list here: http://www.dupuyinstitute.org/tdipub3.htm). We reviewed the listing of validation efforts, and from 1997 to 2005…nothing new had been done (except for a battalion-level validation we had done for the TNDM). So am I now to believe that since 2009, they have actively and aggressively pursued validation? Especially as most of this time was in a period of severely declining budgets, I doubt it. One of the arguments against validation made in meetings I attended in 1987 was that they did not have the time or budget to spend on validating. The budget during the Cold War was luxurious by today’s standards.

If there have been meaningful validations done, I would love to see the validation reports. The proof is in the pudding…..send me the validation reports that will resolve all doubts.

Engaging the Phalanx

The Military Operations Research Society (MORS) publishes a periodical journal called the Phalanx. In the December 2018 issue was an article that referenced one of our blog posts. This took us by surprise. We only found out about thanks to one of the viewers of this blog. We are not members of MORS. The article is paywalled and cannot be easily accessed if you are not a member.

It is titled “Perspectives on Combat Modeling” (page 28) and is written by Jonathan K. Alt, U.S. Army TRADOC Analysis Center, Monterey, CA.; Christopher Morey, PhD, Training and Doctrine Command Analysis Center, Ft. Leavenworth, Kansas; and Larry Larimer, Training and Doctrine Command Analysis Center, White Sands, New Mexico. I am not familiar with any of these three gentlemen.

The blog post that appears to be generating this article is this one:

Wargaming Multi-Domain Battle: The Base Of Sand Problem

Simply by coincidence, Shawn Woodford recently re-posted this in January. It was originally published on 10 April 2017 and was written by Shawn.

The opening two sentences of the article in the Phalanx reads:

Periodically, within the Department of Defense (DoD) analytic community, questions will arise regarding the validity of the combat models and simulations used to support analysis. Many attempts (sic) to resurrect the argument that models, simulations, and wargames “are built on the thin foundation of empirical knowledge about the phenomenon of combat.” (Woodford, 2017).

It is nice to be acknowledged, although it this case, it appears that we are being acknowledged because they disagree with what we are saying.

Probably the word that gets my attention is “resurrect.” It is an interesting word, that implies that this is an old argument that has somehow or the other been put to bed. Granted it is an old argument. On the other hand, it has not been put to bed. If a problem has been identified and not corrected, then it is still a problem. Age has nothing to do with it.

On the other hand, maybe they are using the word “resurrect” because recent developments in modeling and validation have changed the environment significantly enough that these arguments no longer apply. If so, I would be interested in what those changes are. The last time I checked, the modeling and simulation industry was using many of the same models they had used for decades. In some cases, were going back to using simpler hex-games for their modeling and wargaming efforts. We have blogged a couple of times about these efforts. So, in the world of modeling, unless there have been earthshaking and universal changes made in the last five years that have completely revamped the landscape….then the decades old problems still apply to the decades old models and simulations.

More to come (this is the first of at least 7 posts on this subject).

What Multi-Domain Operations Wargames Are You Playing? [Updated]

Source: David A. Shlapak and Michael Johnson. Reinforcing Deterrence on NATO’s Eastern Flank: Wargaming the Defense of the Baltics. Santa Monica, CA: RAND Corporation, 2016.

 

 

 

 

 

 

 

[UPDATE] We had several readers recommend games they have used or would be suitable for simulating Multi-Domain Battle and Operations (MDB/MDO) concepts. These include several classic campaign-level board wargames:

The Next War (SPI, 1976)

NATO: The Next War in Europe (Victory Games, 1983)

For tactical level combat, there is Steel Panthers: Main Battle Tank (SSI/Shrapnel Games, 1996- )

There were also a couple of naval/air oriented games:

Asian Fleet (Kokusai-Tsushin Co., Ltd. (国際通信社) 2007, 2010)

Command: Modern Air Naval Operations (Matrix Games, 2014)

Are there any others folks are using out there?


A Mystics & Statistic reader wants to know what wargames are being used to simulate and explore Multi-Domain Battle and Operations (MDB/MDO) concepts?

There is a lot of MDB/MDO wargaming going on in at all levels in the U.S. Department of Defense. Much of this appears to use existing models, simulations, and wargames, such as the U.S. Army Center for Army Analysis’s unclassified Wargaming Analysis Model (C-WAM).

Chris Lawrence recently looked at C-WAM and found that it uses a lot of traditional board wargaming elements, including methodologies for determining combat results, casualties, and breakpoints that have been found unable to replicate real-world outcomes (aka “The Base of Sand” problem).

C-WAM 1

C-WAM 2

C-WAM 3

C-WAM 4 (Breakpoints)

There is also the wargame used by RAND to look at possible scenarios for a potential Russian invasion of the Baltic States.

Wargaming the Defense of the Baltics

Wargaming at RAND

What other wargames, models, and simulations are there being used out there? Are there any commercial wargames incorporating MDB/MDO elements into their gameplay? What methodologies are being used to portray MDB/MDO effects?

Wargaming Multi-Domain Battle: The Base Of Sand Problem

“JTLS Overview Movie by Rolands & Associates” [YouTube]

[This piece was originally posted on 10 April 2017.]

As the U.S. Army and U.S. Marine Corps work together to develop their joint Multi-Domain Battle concept, wargaming and simulation will play a significant role. Aspects of the construct have already been explored through the Army’s Unified Challenge, Joint Warfighting Assessment, and Austere Challenge exercises, and upcoming Unified Quest and U.S. Army, Pacific war games and exercises. U.S. Pacific Command and U.S. European Command also have simulations and exercises scheduled.

A great deal of importance has been placed on the knowledge derived from these activities. As the U.S. Army Training and Doctrine Command recently stated,

Concept analysis informed by joint and multinational learning events…will yield the capabilities required of multi-domain battle. Resulting doctrine, organization, training, materiel, leadership, personnel and facilities solutions will increase the capacity and capability of the future force while incorporating new formations and organizations.

There is, however, a problem afflicting the Defense Department’s wargames, of which the military operations research and models and simulations communities have long been aware, but have been slow to address: their models are built on a thin foundation of empirical knowledge about the phenomenon of combat. None have proven the ability to replicate real-world battle experience. This is known as the “base of sand” problem.

A Brief History of The Base of Sand

All combat models and simulations are abstracted theories of how combat works. Combat modeling in the United States began in the early 1950s as an extension of military operations research that began during World War II. Early model designers did not have large base of empirical combat data from which to derive their models. Although a start had been made during World War II and the Korean War to collect real-world battlefield data from observation and military unit records, an effort that provided useful initial insights, no systematic effort has ever been made to identify and assemble such information. In the absence of extensive empirical combat data, model designers turned instead to concepts of combat drawn from official military doctrine (usually of uncertain provenance), subject matter expertise, historians and theorists, the physical sciences, or their own best guesses.

As the U.S. government’s interest in scientific management methods blossomed in the late 1950s and 1960s, the Defense Department’s support for operations research and use of combat modeling in planning and analysis grew as well. By the early 1970s, it became evident that basic research on combat had not kept pace. A survey of existing combat models by Gary Shubik and Martin Brewer for RAND in 1972 concluded that

Basic research and knowledge is lacking. The majority of the MSGs [models, simulations and games] sampled are living off a very slender intellectual investment in fundamental knowledge…. [T]he need for basic research is so critical that if no other funding were available we would favor a plan to reduce by a significant proportion all current expenditures for MSGs and to use the saving for basic research.

In 1975, John Stockfish took a direct look at the use of data and combat models for managing decisions regarding conventional military forces for RAND. He emphatically stated that “[T]he need for better and more empirical work, including operational testing, is of such a magnitude that a major reallocating of talent from model building to fundamental empirical work is called for.”

In 1991, Paul K. Davis, an analyst for RAND, and Donald Blumenthal, a consultant to the Livermore National Laboratory, published an assessment of the state of Defense Department combat modeling. It began as a discussion between senior scientists and analysts from RAND, Livermore, and the NASA Jet Propulsion Laboratory, and the Defense Advanced Research Projects Agency (DARPA) sponsored an ensuing report, The Base of Sand Problem: A White Paper on the State of Military Combat Modeling.

Davis and Blumenthal contended

The [Defense Department] is becoming critically dependent on combat models (including simulations and war games)—even more dependent than in the past. There is considerable activity to improve model interoperability and capabilities for distributed war gaming. In contrast to this interest in model-related technology, there has been far too little interest in the substance of the models and the validity of the lessons learned from using them. In our view, the DoD does not appreciate that in many cases the models are built on a base of sand…

[T]he DoD’s approach in developing and using combat models, including simulations and war games, is fatally flawed—so flawed that it cannot be corrected with anything less than structural changes in management and concept. [Original emphasis]

As a remedy, the authors recommended that the Defense Department create an office to stimulate a national military science program. This Office of Military Science would promote and sponsor basic research on war and warfare while still relying on the military services and other agencies for most research and analysis.

Davis and Blumenthal initially drafted their white paper before the 1991 Gulf War, but the performance of the Defense Department’s models and simulations in that conflict underscored the very problems they described. Defense Department wargames during initial planning for the conflict reportedly predicted tens of thousands of U.S. combat casualties. These simulations were said to have led to major changes in U.S. Central Command’s operational plan. When the casualty estimates leaked, they caused great public consternation and inevitable Congressional hearings.

While all pre-conflict estimates of U.S. casualties in the Gulf War turned out to be too high, the Defense Department’s predictions were the most inaccurate, by several orders of magnitude. This performance, along with Davis and Blumenthal’s scathing critique, should have called the Defense Department’s entire modeling and simulation effort into question. But it did not.

The Problem Persists

The Defense Department’s current generation of models and simulations harbor the same weaknesses as the ones in use in the 1990s. Some are new iterations of old models with updated graphics and code, but using the same theoretical assumptions about combat. In most cases, no one other than the designers knows exactly what data and concepts the models are based upon. This practice is known in the technology world as black boxing. While black boxing may be an essential business practice in the competitive world of government consulting, it makes independently evaluating the validity of combat models and simulations nearly impossible. This should be of major concern because many models and simulations in use today contain known flaws.

Some, such as  Joint Theater Level Simulation (JTLS), use the Lanchester equations for calculating attrition in ground combat. However, multiple studies have shown that these equations are incapable of replicating real-world combat. British engineer Frederick W. Lanchester developed and published them in 1916 as an abstract conceptualization of aerial combat, stating himself that he did not believe they were applicable to ground combat. If Lanchester-based models cannot accurately represent historical combat, how can there be any confidence that they are realistically predicting future combat?

Others, such as the Joint Conflict And Tactical Simulation (JCATS), MAGTF Tactical Warfare System (MTWS), and Warfighters’ Simulation (WARSIM) adjudicate ground combat using probability of hit/probability of kill (pH/pK) algorithms. Corps Battle Simulation (CBS) uses pH/pK for direct fire attrition and a modified version of Lanchester for indirect fire. While these probabilities are developed from real-world weapon system proving ground data, their application in the models is combined with inputs from subjective sources, such as outputs from other combat models, which are likely not based on real-world data. Multiplying an empirically-derived figure by a judgement-based coefficient results in a judgement-based estimate, which might be accurate or it might not. No one really knows.

Potential Remedies

One way of assessing the accuracy of these models and simulations would be to test them against real-world combat data, which does exist. In theory, Defense Department models and simulations are supposed to be subjected to validation, verification, and accreditation, but in reality this is seldom, if ever, rigorously done. Combat modelers could also open the underlying theories and data behind their models and simulations for peer review.

The problem is not confined to government-sponsored research and development. In his award-winning 2004 book examining the bases for victory and defeat in battle, Military Power: Explaining Victory and Defeat in Modern Battle, analyst Stephen Biddle noted that the study of military science had been neglected in the academic world as well. “[F]or at least a generation, the study of war’s conduct has fallen between the stools of the institutional structure of modern academia and government,” he wrote.

This state of affairs seems remarkable given the enormous stakes that are being placed on the output of the Defense Department’s modeling and simulation activities. After decades of neglect, remedying this would require a dedicated commitment to sustained basic research on the military science of combat and warfare, with no promise of a tangible short-term return on investment. Yet, as Biddle pointed out, “With so much at stake, we surely must do better.”

[NOTE: The attrition methodologies used in CBS and WARSIM have been corrected since this post was originally published per comments provided by their developers.]

Comparing Force Ratios to Casualty Exchange Ratios

“American Marines in Belleau Wood (1918)” by Georges Scott [Wikipedia]

Comparing Force Ratios to Casualty Exchange Ratios
Christopher A. Lawrence

[The article below is reprinted from the Summer 2009 edition of The International TNDM Newsletter.]

There are three versions of force ratio versus casualty exchange ratio rules, such as the three-to-one rule (3-to-1 rule), as it applies to casualties. The earliest version of the rule as it relates to casualties that we have been able to find comes from the 1958 version of the U.S. Army Maneuver Control manual, which states: “When opposing forces are in contact, casualties are assessed in inverse ratio to combat power. For friendly forces advancing with a combat power superiority of 5 to 1, losses to friendly forces will be about 1/5 of those suffered by the opposing force.”[1]

The RAND version of the rule (1992) states that: “the famous ‘3:1 rule ’, according to which the attacker and defender suffer equal fractional loss rates at a 3:1 force ratio the battle is in mixed terrain and the defender enjoys ‘prepared ’defenses…” [2]

Finally, there is a version of the rule that dates from the 1967 Maneuver Control manual that only applies to armor that shows:

As the RAND construct also applies to equipment losses, then this formulation is directly comparable to the RAND construct.

Therefore, we have three basic versions of the 3-to-1 rule as it applies to casualties and/or equipment losses. First, there is a rule that states that there is an even fractional loss ratio at 3-to-1 (the RAND version), Second, there is a rule that states that at 3-to-1, the attacker will suffer one-third the losses of the defender. And third, there is a rule that states that at 3-to-1, the attacker and defender will suffer the same losses as the defender. Furthermore, these examples are highly contradictory, with either the attacker suffering three times the losses of the defender, the attacker suffering the same losses as the defender, or the attacker suffering 1/3 the losses of the defender.

Therefore, what we will examine here is the relationship between force ratios and exchange ratios. In this case, we will first look at The Dupuy Institute’s Battles Database (BaDB), which covers 243 battles from 1600 to 1900. We will chart on the y-axis the force ratio as measured by a count of the number of people on each side of the forces deployed for battle. The force ratio is the number of attackers divided by the number of defenders. On the x-axis is the exchange ratio, which is a measured by a count of the number of people on each side who were killed, wounded, missing or captured during that battle. It does not include disease and non-battle injuries. Again, it is calculated by dividing the total attacker casualties by the total defender casualties. The results are provided below:

As can be seen, there are a few extreme outliers among these 243 data points. The most extreme, the Battle of Tippennuir (l Sep 1644), in which an English Royalist force under Montrose routed an attack by Scottish Covenanter militia, causing about 3,000 casualties to the Scots in exchange for a single (allegedly self-inflicted) casualty to the Royalists, was removed from the chart. This 3,000-to-1 loss ratio was deemed too great an outlier to be of value in the analysis.

As it is, the vast majority of cases are clumped down into the corner of the graph with only a few scattered data points outside of that clumping. If one did try to establish some form of curvilinear relationship, one would end up drawing a hyperbola. It is worthwhile to look inside that clump of data to see what it shows. Therefore, we will look at the graph truncated so as to show only force ratios at or below 20-to-1 and exchange rations at or below 20-to-1.

Again, the data remains clustered in one corner with the outlying data points again pointing to a hyperbola as the only real fitting curvilinear relationship. Let’s look at little deeper into the data by truncating the data on 6-to-1 for both force ratios and exchange ratios. As can be seen, if the RAND version of the 3-to-1 rule is correct, then the data should show at 3-to-1 force ratio a 3-to-1 casualty exchange ratio. There is only one data point that comes close to this out of the 243 points we examined.

If the FM 105-5 version of the rule as it applies to armor is correct, then the data should show that at 3-to-1 force ratio there is a 1-to-1 casualty exchange ratio, at a 4-to-1 force ratio a 1-to-2 casualty exchange ratio, and at a 5-to-1 force ratio a 1-to-3 casualty exchange ratio. Of course, there is no armor in these pre-WW I engagements, but again no such exchange pattern does appear.

If the 1958 version of the FM 105-5 rule as it applies to casualties is correct, then the data should show that at a 3-to-1 force ratio there is 0.33-to-1 casualty exchange ratio, at a 4-to-1 force ratio a .25-to-1 casualty exchange ratio, and at a 5-to-1 force ratio a 0.20-to-5 casualty exchange ratio. As can be seen, there is not much indication of this pattern, or for that matter any of the three patterns.

Still, such a construct may not be relevant to data before 1900. For example, Lanchester claimed in 1914 in Chapter V, “The Principal of Concentration,” of his book Aircraft in Warfare, that there is greater advantage to be gained in modern warfare from concentration of fire.[3] Therefore, we will tap our more modern Division-Level Engagement Database (DLEDB) of 675 engagements, of which 628 have force ratios and exchange ratios calculated for them. These 628 cases are then placed on a scattergram to see if we can detect any similar patterns.

Even though this data covers from 1904 to 1991, with the vast majority of the data coming from engagements after 1940, one again sees the same pattern as with the data from 1600-1900. If there is a curvilinear relationship, it is again a hyperbola. As before, it is useful to look into the mass of data clustered into the corner by truncating the force and exchange ratios at 20-to-1. This produces the following:

Again, one sees the data clustered in the corner, with any curvilinear relationship again being a hyperbola. A look at the data further truncated to a 10-to-1 force or exchange ratio does not yield anything more revealing.

And, if this data is truncated to show only 5-to-1 force ratio and exchange ratios, one again sees:

Again, this data appears to be mostly just noise, with no clear patterns here that support any of the three constructs. In the case of the RAND version of the 3-to-1 rule, there is again only one data point (out of 628) that is anywhere close to the crossover point (even fractional exchange rate) that RAND postulates. In fact, it almost looks like the data conspires to make sure it leaves a noticeable “hole” at that point. The other postulated versions of the 3-to-1 rules are also given no support in these charts.

Also of note, that the relationship between force ratios and exchange ratios does not appear to significantly change for combat during 1600-1900 when compared to the data from combat from 1904-1991. This does not provide much support for the intellectual construct developed by Lanchester to argue for his N-square law.

While we can attempt to torture the data to find a better fit, or can try to argue that the patterns are obscured by various factors that have not been considered, we do not believe that such a clear pattern and relationship exists. More advanced mathematical methods may show such a pattern, but to date such attempts have not ferreted out these alleged patterns. For example, we refer the reader to Janice Fain’s article on Lanchester equations, The Dupuy Institute’s Capture Rate Study, Phase I & II, or any number of other studies that have looked at Lanchester.[4]

The fundamental problem is that there does not appear to be a direct cause and effect between force ratios and exchange ratios. It appears to be an indirect relationship in the sense that force ratios are one of several independent variables that determine the outcome of an engagement, and the nature of that outcome helps determines the casualties. As such, there is a more complex set of interrelationships that have not yet been fully explored in any study that we know of, although it is briefly addressed in our Capture Rate Study, Phase I & II.

NOTES

[1] FM 105-5, Maneuver Control (1958), 80.

[2] Patrick Allen, “Situational Force Scoring: Accounting for Combined Arms Effects in Aggregate Combat Models,” (N-3423-NA, The RAND Corporation, Santa Monica, CA, 1992), 20.

[3] F. W. Lanchester, Aircraft in Warfare: The Dawn of the Fourth Arm (Lanchester Press Incorporated, Sunnyvale, Calif., 1995), 46-60. One notes that Lanchester provided no data to support these claims, but relied upon an intellectual argument based upon a gross misunderstanding of ancient warfare.

[4] In particular, see page 73 of Janice B. Fain, “The Lanchester Equations and Historical Warfare: An Analysis of Sixty World War II Land Engagements,” Combat Data Subscription Service (HERO, Arlington, Va., Spring 1975).

Are There Only Three Ways of Assessing Military Power?

military-power[This article was originally posted on 11 October 2016]

In 2004, military analyst and academic Stephen Biddle published Military Power: Explaining Victory and Defeat in Modern Battle, a book that addressed the fundamental question of what causes victory and defeat in battle. Biddle took to task the study of the conduct of war, which he asserted was based on “a weak foundation” of empirical knowledge. He surveyed the existing literature on the topic and determined that the plethora of theories of military success or failure fell into one of three analytical categories: numerical preponderance, technological superiority, or force employment.

Numerical preponderance theories explain victory or defeat in terms of material advantage, with the winners possessing greater numbers of troops, populations, economic production, or financial expenditures. Many of these involve gross comparisons of numbers, but some of the more sophisticated analyses involve calculations of force density, force-to-space ratios, or measurements of quality-adjusted “combat power.” Notions of threshold “rules of thumb,” such as the 3-1 rule, arise from this. These sorts of measurements form the basis for many theories of power in the study of international relations.

The next most influential means of assessment, according to Biddle, involve views on the primacy of technology. One school, systemic technology theory, looks at how technological advances shift balances within the international system. The best example of this is how the introduction of machine guns in the late 19th century shifted the advantage in combat to the defender, and the development of the tank in the early 20th century shifted it back to the attacker. Such measures are influential in international relations and political science scholarship.

The other school of technological determinacy is dyadic technology theory, which looks at relative advantages between states regardless of posture. This usually involves detailed comparisons of specific weapons systems, tanks, aircraft, infantry weapons, ships, missiles, etc., with the edge going to the more sophisticated and capable technology. The use of Lanchester theory in operations research and combat modeling is rooted in this thinking.

Biddle identified the third category of assessment as subjective assessments of force employment based on non-material factors including tactics, doctrine, skill, experience, morale or leadership. Analyses on these lines are the stock-in-trade of military staff work, military historians, and strategic studies scholars. However, international relations theorists largely ignore force employment and operations research combat modelers tend to treat it as a constant or omit it because they believe its effects cannot be measured.

The common weakness of all of these approaches, Biddle argued, is that “there are differing views, each intuitively plausible but none of which can be considered empirically proven.” For example, no one has yet been able to find empirical support substantiating the validity of the 3-1 rule or Lanchester theory. Biddle notes that the track record for predictions based on force employment analyses has also been “poor.” (To be fair, the problem of testing theory to see if applies to the real world is not limited to assessments of military power, it afflicts security and strategic studies generally.)

So, is Biddle correct? Are there only three ways to assess military outcomes? Are they valid? Can we do better?

The Lanchester Equations and Historical Warfare

Allied force dispositions at the Battle of Anzio, on 1 February 1944. [U.S. Army/Wikipedia]

[The article below is reprinted from History, Numbers And War: A HERO Journal, Vol. 1, No. 1, Spring 1977, pp. 34-52]

The Lanchester Equations and Historical Warfare: An Analysis of Sixty World War II Land Engagements

By Janice B. Fain

Background and Objectives

The method by which combat losses are computed is one of the most critical parts of any combat model. The Lanchester equations, which state that a unit’s combat losses depend on the size of its opponent, are widely used for this purpose.

In addition to their use in complex dynamic simulations of warfare, the Lanchester equations have also sewed as simple mathematical models. In fact, during the last decade or so there has been an explosion of theoretical developments based on them. By now their variations and modifications are numerous, and “Lanchester theory” has become almost a separate branch of applied mathematics. However, compared with the effort devoted to theoretical developments, there has been relatively little empirical testing of the basic thesis that combat losses are related to force sizes.

One of the first empirical studies of the Lanchester equations was Engel’s classic work on the Iwo Jima campaign in which he found a reasonable fit between computed and actual U.S. casualties (Note 1). Later studies were somewhat less supportive (Notes 2 and 3), but an investigation of Korean war battles showed that, when the simulated combat units were constrained to follow the tactics of their historical counterparts, casualties during combat could be predicted to within 1 to 13 percent (Note 4).

Taken together, these various studies suggest that, while the Lanchester equations may be poor descriptors of large battles extending over periods during which the forces were not constantly in combat, they may be adequate for predicting losses while the forces are actually engaged in fighting. The purpose of the work reported here is to investigate 60 carefully selected World War II engagements. Since the durations of these battles were short (typically two to three days), it was expected that the Lanchester equations would show a closer fit than was found in studies of larger battles. In particular, one of the objectives was to repeat, in part, Willard’s work on battles of the historical past (Note 3).

The Data Base

Probably the most nearly complete and accurate collection of combat data is the data on World War II compiled by the Historical Evaluation and Research Organization (HERO). From their data HERO analysts selected, for quantitative analysis, the following 60 engagements from four major Italian campaigns:

Salerno, 9-18 Sep 1943, 9 engagements

Volturno, 12 Oct-8 Dec 1943, 20 engagements

Anzio, 22 Jan-29 Feb 1944, 11 engagements

Rome, 14 May-4 June 1944, 20 engagements

The complete data base is described in a HERO report (Note 5). The work described here is not the first analysis of these data. Statistical analyses of weapon effectiveness and the testing of a combat model (the Quantified Judgment Method, QJM) have been carried out (Note 6). The work discussed here examines these engagements from the viewpoint of the Lanchester equations to consider the question: “Are casualties during combat related to the numbers of men in the opposing forces?”

The variables chosen for this analysis are shown in Table 1. The “winners” of the engagements were specified by HERO on the basis of casualties suffered, distance advanced, and subjective estimates of the percentage of the commander’s objective achieved. Variable 12, the Combat Power Ratio, is based on the Operational Lethality Indices (OLI) of the units (Note 7).

The general characteristics of the engagements are briefly described. Of the 60, there were 19 attacks by British forces, 28 by U.S. forces, and 13 by German forces. The attacker was successful in 34 cases; the defender, in 23; and the outcomes of 3 were ambiguous. With respect to terrain, 19 engagements occurred in flat terrain; 24 in rolling, or intermediate, terrain; and 17 in rugged, or difficult, terrain. Clear weather prevailed in 40 cases; 13 engagements were fought in light or intermittent rain; and 7 in medium or heavy rain. There were 28 spring and summer engagements and 32 fall and winter engagements.

Comparison of World War II Engagements With Historical Battles

Since one purpose of this work is to repeat, in part, Willard’s analysis, comparison of these World War II engagements with the historical battles (1618-1905) studied by him will be useful. Table 2 shows a comparison of the distribution of battles by type. Willard’s cases were divided into two categories: I. meeting engagements, and II. sieges, attacks on forts, and similar operations. HERO’s World War II engagements were divided into four types based on the posture of the defender: 1. delay, 2. hasty defense, 3. prepared position, and 4. fortified position. If postures 1 and 2 are considered very roughly equivalent to Willard’s category I, then in both data sets the division into the two gross categories is approximately even.

The distribution of engagements across force ratios, given in Table 3, indicated some differences. Willard’s engagements tend to cluster at the lower end of the scale (1-2) and at the higher end (4 and above), while the majority of the World War II engagements were found in mid-range (1.5 – 4) (Note 8). The frequency with which the numerically inferior force achieved victory is shown in Table 4. It is seen that in neither data set are force ratios good predictors of success in battle (Note 9).

Table 3.

Results of the Analysis Willard’s Correlation Analysis

There are two forms of the Lanchester equations. One represents the case in which firing units on both sides know the locations of their opponents and can shift their fire to a new target when a “kill” is achieved. This leads to the “square” law where the loss rate is proportional to the opponent’s size. The second form represents that situation in which only the general location of the opponent is known. This leads to the “linear” law in which the loss rate is proportional to the product of both force sizes.

As Willard points out, large battles are made up of many smaller fights. Some of these obey one law while others obey the other, so that the overall result should be a combination of the two. Starting with a general formulation of Lanchester’s equations, where g is the exponent of the target unit’s size (that is, g is 0 for the square law and 1 for the linear law), he derives the following linear equation:

log (nc/mc) = log E + g log (mo/no) (1)

where nc and mc are the casualties, E is related to the exchange ratio, and mo and no are the initial force sizes. Linear regression produces a value for g. However, instead of lying between 0 and 1, as expected, the) g‘s range from -.27 to -.87, with the majority lying around -.5. (Willard obtains several values for g by dividing his data base in various ways—by force ratio, by casualty ratio, by historical period, and so forth.) A negative g value is unpleasant. As Willard notes:

Military theorists should be disconcerted to find g < 0, for in this range the results seem to imply that if the Lanchester formulation is valid, the casualty-producing power of troops increases as they suffer casualties (Note 3).

From his results, Willard concludes that his analysis does not justify the use of Lanchester equations in large-scale situations (Note 10).

Analysis of the World War II Engagements

Willard’s computations were repeated for the HERO data set. For these engagements, regression produced a value of -.594 for g (Note 11), in striking agreement with Willard’s results. Following his reasoning would lead to the conclusion that either the Lanchester equations do not represent these engagements, or that the casualty producing power of forces increases as their size decreases.

However, since the Lanchester equations are so convenient analytically and their use is so widespread, it appeared worthwhile to reconsider this conclusion. In deriving equation (1), Willard used binomial expansions in which he retained only the leading terms. It seemed possible that the poor results might he due, in part, to this approximation. If the first two terms of these expansions are retained, the following equation results:

log (nc/mc) = log E + log (Mo-mc)/(no-nc) (2)

Repeating this regression on the basis of this equation leads to g = -.413 (Note 12), hardly an improvement over the initial results.

A second attempt was made to salvage this approach. Starting with raw OLI scores (Note 7), HERO analysts have computed “combat potentials” for both sides in these engagements, taking into account the operational factors of posture, vulnerability, and mobility; environmental factors like weather, season, and terrain; and (when the record warrants) psychological factors like troop training, morale, and the quality of leadership. Replacing the factor (mo/no) in Equation (1) by the combat power ratio produces the result) g = .466 (Note 13).

While this is an apparent improvement in the value of g, it is achieved at the expense of somewhat distorting the Lanchester concept. It does preserve the functional form of the equations, but it requires a somewhat strange definition of “killing rates.”

Analysis Based on the Differential Lanchester Equations

Analysis of the type carried out by Willard appears to produce very poor results for these World War II engagements. Part of the reason for this is apparent from Figure 1, which shows the scatterplot of the dependent variable, log (nc/mc), against the independent variable, log (mo/no). It is clear that no straight line will fit these data very well, and one with a positive slope would not be much worse than the “best” line found by regression. To expect the exponent to account for the wide variation in these data seems unreasonable.

Here, a simpler approach will be taken. Rather than use the data to attempt to discriminate directly between the square and the linear laws, they will be used to estimate linear coefficients under each assumption in turn, starting with the differential formulation rather than the integrated equations used by Willard.

In their simplest differential form, the Lanchester equations may be written;

Square Law; dA/dt = -kdD and dD/dt = kaA (3)

Linear law: dA/dt = -k’dAD and dD/dt = k’aAD (4)

where

A(D) is the size of the attacker (defender)

dA/dt (dD/dt) is the attacker’s (defender’s) loss rate,

ka, k’a (kd, k’d) are the attacker’s (defender’s) killing rates

For this analysis, the day is taken as the basic time unit, and the loss rate per day is approximated by the casualties per day. Results of the linear regressions are given in Table 5. No conclusions should be drawn from the fact that the correlation coefficients are higher in the linear law case since this is expected for purely technical reasons (Note 14). A better picture of the relationships is again provided by the scatterplots in Figure 2. It is clear from these plots that, as in the case of the logarithmic forms, a single straight line will not fit the entire set of 60 engagements for either of the dependent variables.

To investigate ways in which the data set might profitably be subdivided for analysis, T-tests of the means of the dependent variable were made for several partitionings of the data set. The results, shown in Table 6, suggest that dividing the engagements by defense posture might prove worthwhile.

Results of the linear regressions by defense posture are shown in Table 7. For each posture, the equation that seemed to give a better fit to the data is underlined (Note 15). From this table, the following very tentative conclusions might be drawn:

  • In an attack on a fortified position, the attacker suffers casualties by the square law; the defender suffers casualties by the linear law. That is, the defender is aware of the attacker’s position, while the attacker knows only the general location of the defender. (This is similar to Deitchman’s guerrilla model. Note 16).
  • This situation is apparently reversed in the cases of attacks on prepared positions and hasty defenses.
  • Delaying situations seem to be treated better by the square law for both attacker and defender.

Table 8 summarizes the killing rates by defense posture. The defender has a much higher killing rate than the attacker (almost 3 to 1) in a fortified position. In a prepared position and hasty defense, the attacker appears to have the advantage. However, in a delaying action, the defender’s killing rate is again greater than the attacker’s (Note 17).

Figure 3 shows the scatterplots for these cases. Examination of these plots suggests that a tentative answer to the study question posed above might be: “Yes, casualties do appear to be related to the force sizes, but the relationship may not be a simple linear one.”

In several of these plots it appears that two or more functional forms may be involved. Consider, for example, the defender‘s casualties as a function of the attacker’s initial strength in the case of a hasty defense. This plot is repeated in Figure 4, where the points appear to fit the curves sketched there. It would appear that there are at least two, possibly three, separate relationships. Also on that plot, the individual engagements have been identified, and it is interesting to note that on the curve marked (1), five of the seven attacks were made by Germans—four of them from the Salerno campaign. It would appear from this that German attacks are associated with higher than average defender casualties for the attacking force size. Since there are so few data points, this cannot be more than a hint or interesting suggestion.

Future Research

This work suggests two conclusions that might have an impact on future lines of research on combat dynamics:

  • Tactics appear to be an important determinant of combat results. This conclusion, in itself, does not appear startling, at least not to the military. However, it does not always seem to have been the case that tactical questions have been considered seriously by analysts in their studies of the effects of varying force levels and force mixes.
  • Historical data of this type offer rich opportunities for studying the effects of tactics. For example, consideration of the narrative accounts of these battles might permit re-coding the engagements into a larger, more sensitive set of engagement categories. (It would, of course, then be highly desirable to add more engagements to the data set.)

While predictions of the future are always dangerous, I would nevertheless like to suggest what appears to be a possible trend. While military analysis of the past two decades has focused almost exclusively on the hardware of weapons systems, at least part of our future analysis will be devoted to the more behavioral aspects of combat.

Janice Bloom Fain, a Senior Associate of CACI, lnc., is a physicist whose special interests are in the applications of computer simulation techniques to industrial and military operations; she is the author of numerous reports and articles in this field. This paper was presented by Dr. Fain at the Military Operations Research Symposium at Fort Eustis, Virginia.

NOTES

[1.] J. H. Engel, “A Verification of Lanchester’s Law,” Operations Research 2, 163-171 (1954).

[2.] For example, see R. L. Helmbold, “Some Observations on the Use of Lanchester’s Theory for Prediction,” Operations Research 12, 778-781 (1964); H. K. Weiss, “Lanchester-Type Models of Warfare,” Proceedings of the First International Conference on Operational Research, 82-98, ORSA (1957); H. K. Weiss, “Combat Models and Historical Data; The U.S. Civil War,” Operations Research 14, 750-790 (1966).

[3.] D. Willard, “Lanchester as a Force in History: An Analysis of Land Battles of the Years 1618-1905,” RAC-TD-74, Research Analysis Corporation (1962). what appears to be a possible trend. While military analysis of the past two decades has focused almost exclusively on the hardware of weapons systems, at least part of our future analysis will be devoted to the more behavioral aspects of combat.

[4.] The method of computing the killing rates forced a fit at the beginning and end of the battles. See W. Fain, J. B. Fain, L. Feldman, and S. Simon, “Validation of Combat Models Against Historical Data,” Professional Paper No. 27, Center for Naval Analyses, Arlington, Virginia (1970).

[5.] HERO, “A Study of the Relationship of Tactical Air Support Operations to Land Combat, Appendix B, Historical Data Base.” Historical Evaluation and Research Organization, report prepared for the Defense Operational Analysis Establishment, U.K.T.S.D., Contract D-4052 (1971).

[6.] T. N. Dupuy, The Quantified Judgment Method of Analysis of Historical Combat Data, HERO Monograph, (January 1973); HERO, “Statistical Inference in Analysis in Combat,” Annex F, Historical Data Research on Tactical Air Operations, prepared for Headquarters USAF, Assistant Chief of Staff for Studies and Analysis, Contract No. F-44620-70-C-0058 (1972).

[7.] The Operational Lethality Index (OLI) is a measure of weapon effectiveness developed by HERO.

[8.] Since Willard’s data did not indicate which side was the attacker, his force ratio is defined to be (larger force/smaller force). The HERO force ratio is (attacker/defender).

[9.] Since the criteria for success may have been rather different for the two sets of battles, this comparison may not be very meaningful.

[10.] This work includes more complex analysis in which the possibility that the two forces may be engaging in different types of combat is considered, leading to the use of two exponents rather than the single one, Stochastic combat processes are also treated.

[11.] Correlation coefficient = -.262;

Intercept = .00115; slope = -.594.

[12.] Correlation coefficient = -.184;

Intercept = .0539; slope = -,413.

[13.] Correlation coefficient = .303;

Intercept = -.638; slope = .466.

[14.] Correlation coefficients for the linear law are inflated with respect to the square law since the independent variable is a product of force sizes and, thus, has a higher variance than the single force size unit in the square law case.

[15.] This is a subjective judgment based on the following considerations Since the correlation coefficient is inflated for the linear law, when it is lower the square law case is chosen. When the linear law correlation coefficient is higher, the case with the intercept closer to 0 is chosen.

[16.] S. J. Deitchman, “A Lanchester Model of Guerrilla Warfare,” Operations Research 10, 818-812 (1962).

[17.] As pointed out by Mr. Alan Washburn, who prepared a critique on this paper, when comparing numerical values of the square law and linear law killing rates, the differences in units must be considered. (See footnotes to Table 7).

Assessing The Assessments Of The Military Balance In The China Seas

“If we maintain our faith in God, love of freedom, and superior global airpower, the future [of the US] looks good.” — U.S. Air Force General Curtis E. LeMay (Commander, U.S. Strategic Command, 1948-1957)

Curtis LeMay was involved in the formation of RAND Corporation after World War II. RAND created several models to measure the dynamics of the US-China military balance over time. Since 1996, this has been computed for two scenarios, differing by range from mainland China: one over Taiwan and the other over the Spratly Islands. The results of the model results for selected years can be seen in the graphic below.

The capabilities listed in the RAND study are interesting, notable in that the air superiority category, rough parity exists as of 2017. Also, the ability to attack air bases has given an advantage to the Chinese forces.

Investigating the methodology used does not yield any precise quantitative modeling examples, as would be expected in a rigorous academic effort, although there is some mention of statistics, simulation and historical examples.

The analysis presented here necessarily simplifies a great number of conflict characteristics. The emphasis throughout is on developing and assessing metrics in each area that provide a sense of the level of difficulty faced by each side in achieving its objectives. Apart from practical limitations, selectivity is driven largely by the desire to make the work transparent and replicable. Moreover, given the complexities and uncertainties in modern warfare, one could make the case that it is better to capture a handful of important dynamics than to present the illusion of comprehensiveness and precision. All that said, the analysis is grounded in recognized conclusions from a variety of historical sources on modern warfare, from the air war over Korea and Vietnam to the naval conflict in the Falklands and SAM hunting in Kosovo and Iraq. [Emphasis added].

We coded most of the scorecards (nine out of ten) using a five-color stoplight scheme to denote major or minor U.S. advantage, a competitive situation, or major or minor Chinese advantage. Advantage, in this case, means that one side is able to achieve its primary objectives in an operationally relevant time frame while the other side would have trouble in doing so. [Footnote] For example, even if the U.S. military could clear the skies of Chinese escort fighters with minimal friendly losses, the air superiority scorecard could be coded as “Chinese advantage” if the United States cannot prevail while the invasion hangs in the balance. If U.S. forces cannot move on to focus on destroying attacking strike and bomber aircraft, they cannot contribute to the larger mission of protecting Taiwan.

All of the dynamic modeling methodology (which involved a mix of statistical analysis, Monte Carlo simulation, and modified Lanchester equations) is publicly available and widely used by specialists at U.S. and foreign civilian and military universities.” [Emphasis added].

As TDI has contended before, the problem with using Lanchester’s equations is that, despite numerous efforts, no one has been able to demonstrate that they accurately represent real-world combat. So, even with statistics and simulation, how good are the results if they have relied on factors or force ratios with no relation to actual combat?

What about new capabilities?

As previously posted, the Kratos Mako Unmanned Combat Aerial Vehicle (UCAV), marketed as the “unmanned wingman,” has recently been cleared for export by the U.S. State Department. This vehicle is specifically oriented towards air-to-air combat, is stated to have unparalleled maneuverability, as it need not abide by limits imposed by human physiology. The Mako “offers fighter-like performance and is designed to function as a wingman to manned aircraft, as a force multiplier in contested airspace, or to be deployed independently or in groups of UASs. It is capable of carrying both weapons and sensor systems.” In addition, the Mako has the capability to be launched independently of a runway, as illustrated below. The price for these vehicles is three million each, dropping to two million each for an order of at least 100 units. Assuming a cost of $95 million for an F-35A, we can imagine a hypothetical combat scenario pitting two F-35As up against 100 of these Mako UCAVs in a drone swarm; a great example of the famous phrase, quantity has a quality all its own.

A battery of Kratos Aerial Target drone ready for take off. One of the advantages of the low-cost Kratos drones are their ability to get into the air quickly. [Kratos Defense]

How to evaluate the effects of these possible UCAV drone swarms?

In building up towards the analysis of all of these capabilities in the full theater, campaign level conflict, some supplemental wargaming may be useful. One game that takes a good shot at modeling these dynamics is Asian Fleet.  This is a part of the venerable Fleet Series, published by Victory Games, designed by Joseph Balkoski to model modern (that is Cold War) naval combat. This game system has been extended in recent years, originally by Command Magazine Japan, and then later by Technical Term Gaming Company.

Screenshot of Asian Fleet module by Bryan Taylor [vassalengine.org]

More to follow on how this game transpires!

TDI Friday Read: The Lanchester Equations

Frederick W. Lanchester (1868-1946), British engineer and author of the Lanchester combat attrition equations. [Lanchester.com]

Today’s edition of TDI Friday Read addresses the Lanchester equations and their use in U.S. combat models and simulations. In 1916, British engineer Frederick W. Lanchester published a set of calculations he had derived for determining the results of attrition in combat. Lanchester intended them to be applied as an abstract conceptualization of aerial combat, stating that he did not believe they were applicable to ground combat.

Due to their elegant simplicity, U.S. military operations researchers nevertheless began incorporating the Lanchester equations into their land warfare computer combat models and simulations in the 1950s and 60s. The equations are the basis for many models and simulations used throughout the U.S. defense community today.

The problem with using Lanchester’s equations is that, despite numerous efforts, no one has been able to demonstrate that they accurately represent real-world combat.

Lanchester equations have been weighed….

Really…..Lanchester?

Trevor Dupuy was critical of combat models based on the Lanchester equations because they cannot account for the role behavioral and moral (i.e. human) factors play in combat.

Human Factors In Warfare: Interaction Of Variable Factors

He was also critical of models and simulations that had not been tested to see whether they could reliably represent real-world combat experience. In the modeling and simulation community, this sort of testing is known as validation.

Military History and Validation of Combat Models

The use of unvalidated concepts, like the Lanchester equations, and unvalidated combat models and simulations persists. Critics have dubbed this the “base of sand” problem, and it continues to affect not only models and simulations, but all abstract theories of combat, including those represented in military doctrine.

https://dupuyinstitute.org/2017/04/10/wargaming-multi-domain-battle-the-base-of-sand-problem/