Category: Modeling, Simulation & Wargaming

Signal Multi-player Game

This was just flagged to me by one of our readers in the UK: https://phys.org/news/2019-05-science-wargames.html

It is a multi-player game developed by researchers at the University of California, Lawrence Livermore, and Sandia. It was done for the Carnegie Corporation, a non-profit: https://www.carnegie.org/ and https://en.wikipedia.org/wiki/Carnegie_Corporation_of_New_York

They have an open play window every Wednesday and Thursday, 1 to 5 PM Pacific Time (4 to 8 PM Eastern). The link is here: https://www.signalvideogame.com/

I know nothing about this effort. An image of it is at the top of this blog post. Looks like a fairly typical hex game.

Breakpoints

A couple of our posts on Breakpoints (forced changes in posture) are getting a lot of hits lately. Not sure why or by whom. Let me list all of our posts addressing the issue of breakpoints:

What Is A Breakpoint?

Response 3 (Breakpoints)

Breakpoints in U.S. Army Doctrine

C-WAM 4 (Breakpoints)

Diddlysquat

Engaging the Phalanx (part 7 of 7)

It is also discussed in my book War by Numbers, pages 287-289 and briefly mentioned on page 291.

Oh…and here also (forgot about this one as I only did a search on the word “breakpoint”):

Battle Outcomes: Casualty Rates As a Measure of Defeat

Throwing the Dice

This is a follow-up to the blog post:

China and Russia Defeats the USA

One of our readers commented on this post and posted the following link: Paper Wargames and Policy Making

In case you are not reading the comments to our blog posts, I wanted to make sure this was brought to your attention. The article is worth taking a look at just for the pictures. A few highlights:

  1. “In my lifetime, computer-based simulation have largely taken over analytical gaming, sometimes bringing new levels of investigative power, but often just providing the illusion of it as the details of the models and their simplifying assumptions become invisible to players and to the policymakers whose decisions the games are supposed to inform.”
  2. “However, it gradually became clear [with the Baltic states wargame] – rather disconcertingly – that we were out in front of most of the official planning, not following in its wake.”
  3. “The resulting game….resolves combat using 12-hour turns and 10-km hexes; units are battalions of ground forces, SAM batteries, and half-squadrons of aircraft (12 fighters or six bombers).”
  4. “…and they like throwing the dice.”

Summation of our Validation Posts

This extended series of posts about validation of combat models was originally started by Shawn Woodford’s post on future modeling efforts and the “Base of Sand” problem.

Wargaming Multi-Domain Battle: The Base Of Sand Problem

This post apparently irked some people at TRADOC and they wrote an article in the December issue of the Phalanx referencing his post and criticizing it. This resulted in the following seven responses from me:

Engaging the Phalanx

Validation

Validating Attrition

Physics-based Aspects of Combat

Historical Demonstrations?

SMEs

Engaging the Phalanx (part 7 of 7)

This was probably overkill… but guys who write 1,662-page books sometimes tend to be a little wordy.

While it is very important to identify a problem, it is also helpful to show the way forward. Therefore, I decided to discuss what data bases were available for validation. After all, I would like to see the modeling and simulation efforts move forward (and right now, they seem to be moving backward). This led to the following nine posts:

Validation Data Bases Available (Ardennes)

Validation Data Bases Available (Kursk)

The Use of the Two Campaign Data Bases

The Battle of Britain Data Base

Battles versus Campaigns (for Validation)

The Division Level Engagement Data Base (DLEDB)

Battalion and Company Level Data Bases

Other TDI Data Bases

Other Validation Data Bases

There were also a few other validation issues that had come to mind while I was writing these blog posts, so this led to the following series of three posts:

Face Validation

Validation by Use

Do Training Models Need Validation?

Finally, there were a few other related posts scattered through this rather extended diatribe. They include the following six posts:

Paul Davis (RAND) on Bugaboos

Diddlysquat

TDI Friday Read: Engaging The Phalanx

Combat Adjudication

China and Russia Defeats the USA

Building a Wargamer

That kind of ends this discussion on validation. It kept me busy for a while. Not sure if you were entertained or informed by it. It is time for me to move on to another subject, though I have not yet figured out what that will be.

Building a Wargamer

There is an interesting article by Elizabeth Bartels of RAND from November 2018 on the War on the Rocks website. Worth reading: Building a Pipeline of Wargaming Talent

Let me highlight a few points:

  1. “On issues ranging from potential conflicts with Russia to the future of transportation and logistics, senior leaders have increasingly turned to wargames to imagine potential futures.”
  2. “The path to becoming a gamer today is modeled on the careers of the last generation of gamers — most often members of the military or defense analysts with strong roots in the hobby gaming community of the 1960s and 1970s.”
    1. My question: Should someone at MORS (Military Operations Research Society) nominate Charles S. Roberts and James F. Dunnigan for the Vance R. Wanner or the Clayton J. Thomas awards? (see: https://www.mors.org/Recognition).
  3. One notes that there is no discussion of the “Base of Sand” problem.
  4. One notes there is no discussion of VV&A (verification, validation, and accreditation).
  5. The picture heading her article is of a hex board overlaid by acetate.

Do Training Models Need Validation?

Do we need to validate training models? The argument is that because a model is being used for training (vice analysis), it does not need the rigorous validation that an analytical model would require. In practice, I gather this means they are not validated. It is an argument I encountered after 1997, so it is not addressed in my 1996 letters to TRADOC: see http://www.dupuyinstitute.org/pdf/v1n4.pdf

Over time, the modeling and simulation industry has shifted from using models for analysis to using models for training. The use of models for training has exploded, and these efforts certainly employ a large number of software coders. The question is, if the cores of the analytical models have not been validated, and in some cases are known to have problems, then what are the models teaching people? To date, I am not aware of any training models that have been validated.

Let us consider the case of JICM. The core of the model's attrition calculation was Situational Force Scoring (SFS). Its attrition calculator for ground combat is based upon a version of the 3-to-1 rule, comparing force ratios to exchange ratios. This is discussed in some depth in my book War by Numbers, Chapter 9, Exchange Ratios. To quote from page 76:

If the RAND version of the 3 to 1 rule is correct, then the data should show a 3 to 1 force ratio and a 3 to 1 casualty exchange ratio. However, there is only one data point that comes close to this out of the 243 points we examined.

That was 243 battles from 1600 to 1900, drawn from our Battles Data Base (BaDB). We also tested it against our Division Level Engagement Data Base (DLEDB), which covers 1904-1991, with the same result. To quote from page 78 of my book:

In the case of the RAND version of the 3 to 1 rule, there is again only one data point (out of 628) that is anywhere close to the crossover point (even fractional exchange ratio) that RAND postulates. In fact it almost looks like the data conspire to leave a noticeable hole at that point.
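For readers curious what this kind of test involves, here is a minimal sketch in Python of screening an engagement list for the crossover point that the RAND version of the 3-to-1 rule predicts. The engagement records and the tolerance band below are invented purely for illustration; they are not drawn from the BaDB or DLEDB.

```python
# Minimal sketch: test whether engagements at roughly 3-to-1 force ratios also
# show roughly 3-to-1 casualty exchange ratios (the RAND crossover point).
# The records and tolerance are illustrative assumptions, not TDI data.

battles = [
    # (attacker strength, defender strength, attacker losses, defender losses)
    (30000, 10000, 1200, 2500),
    (24000, 12000, 900, 1800),
    (45000, 15000, 2100, 1900),
]

def ratios(attacker, defender, att_losses, def_losses):
    """Return (force ratio, casualty exchange ratio) for one engagement."""
    force_ratio = attacker / defender
    exchange_ratio = att_losses / def_losses  # attacker losses per defender loss
    return force_ratio, exchange_ratio

TOLERANCE = 0.5  # arbitrary band around the 3:1/3:1 crossover point
near_crossover = 0
for record in battles:
    fr, er = ratios(*record)
    if abs(fr - 3.0) <= TOLERANCE and abs(er - 3.0) <= TOLERANCE:
        near_crossover += 1

print(f"{near_crossover} of {len(battles)} engagements near the 3:1/3:1 crossover")
```

Run over a real engagement database, a count near zero at the crossover point is exactly the sort of result quoted above.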

So, does this create negative learning? If ground operations are such that an attacker ends up losing three times as many troops as the defender when attacking at 3-to-1 odds, does this mean that the model is training people not to attack below those odds, and in fact to wait until they have much more favorable odds? The model was/is (I haven’t checked recently) being used at the U.S. Army War College. This is the advanced education institution that most promotable colonels attend before becoming general officers. Is such a model teaching them incorrect relationships, force ratios, and combat requirements?

You fight as you train. If we are using models to help train people, then it is certainly valid to ask what those models are doing. Are they properly training our soldiers and future commanders? How do we know they are doing this? Have they been validated?

Validation by Use

[Image: Sacrobosco, Tractatus de Sphaera (1550 AD)]

Another argument I have heard over the decades is that models are validated by use. Apparently the argument is that these models have been used for so long, and so many people have worked with their outputs, that they must be fine. I have seen this argument made in writing by a senior Army official in 1997, in response to a letter addressing validation that we encouraged TRADOC to send out:

See: http://www.dupuyinstitute.org/pdf/v1n4.pdf

I doubt that there is any regulation discussing “validation by use,” and I doubt anyone has ever defended the idea in a published paper. Still, it is an argument that I have heard far more than once or twice.

Now, part of the problem is that some of these models have been around for a few decades. For example, the core of some of the models used by CAA, such as COSAGE, first came into existence in 1969. They are using a 50-year-old, albeit updated, model to model modern warfare. My father worked with this model. RAND’s JICM (Joint Integrated Contingency Model) dates back to the 1980s, so it is at least 30 years old. The irony is that some people argue that one should not use historical warfare examples to validate models of modern warfare, yet these models now carry a considerable legacy of their own.

From a practical point of view, it means that the people who originally designed and developed these models have long since retired. In many cases, the people who intimately knew the inner workings of the models have also retired and have not really been replaced. Some of these models have become “black boxes,” where the users do not really know the details of how the models calculate their results. So suddenly, validation by use seems like a reasonable argument: the models pre-date the analysts, who assume there is some validity to them because people have been using them all along. They simply inherited the model. Why question it?

[Image: Illustration by Bartolomeu Velho, 1568 AD]

China and Russia Defeats the USA

A couple of recent articles on the latest wargaming effort done by RAND:

https://www.americanthinker.com/blog/2019/03/rand_corp_wargames_us_loses_to_combined_russiachina_forces.html

The opening line states: “The RAND Corporation’s annual ‘Red on Blue’ wargame simulation found that the United States would be a loser in a conventional confrontation with Russia and China.”

A few other quotes:

  1. “Blue gets its ass handed to it.”
  2. “…the U.S. forces ‘suffer heavy losses in one scenario after another and still can’t stop Russia or China from overrunning U.S. allies in the Baltics or Taiwan.’”

Also see: https://www.asiatimes.com/2019/03/article/did-rand-get-it-right-in-its-war-game-exercise/

A few quotes from that article:

  1. “The US and NATO are unable to stop an attack in the Balkans by the Russians….”
  2. “…and the United States and its allies are unable to prevent the takeover of Taiwan by China.”

The articles do not state what simulations were used to wargame this. The second article references this RAND study (RAND Report), but my quick perusal of it did not identify what simulations were used. A search on the words “model” and “wargame” produced nothing. The words “simulation” and “gaming” lead to the following:

  1.  “It draws on research, analysis, and gaming that the RAND Corporation has done in recent years, incorporating the efforts of strategists, regional specialists, experts in both conventional and irregular military operations, and those skilled in the use of combat simulation tools.”
  2. “Money, time, and talent must therefore be allocated not only to the development and procurement of new equipment and infrastructure, but also to concept development, gaming and analysis, field experimentation, and exploratory joint force exercises.”

Anyhow, I am curious as to what wargames they were using (JICM, the Joint Integrated Contingency Model?). I was not able to find out with a cursory search.

Face Validation

The phrase “face validation” shows up in our blog post earlier this week on Combat Adjudication. It is a phrase I have heard many times over the decades, sometimes from very established operations researchers (OR). So what does it mean?

Well, it is discussed in the Department of the Army Pamphlet 5-11: Verification, Validation and Accreditation of Army Models and Simulations: Pamphlet 5-11

Their first mention of it is on page 34: “SMEs [Subject Matter Experts] or other recognized individuals in the field of inquiry. The process by which experts compare M&S [Modeling and Simulation] structure and M&S output to their estimation of the real world is called face validation, peer review, or independent review.”

On page 35 they go on to state: “RDA [Research, Development, and Acquisition]….The validation method typically chosen for this category of M&S is face validation.”

And on page 36 under Technical Methods: “Face validation. This is the process of determining whether an M&S, on the surface, seems reasonable to personnel who are knowledgeable about the system or phenomena under study. This method applies the knowledge and understanding of experts in the field and is subject to their biases. It can produce a consensus of the community if the number [and] breadth of experience of the experts represent the key commands and agencies. Face validation is a point of departure to determine courses of action for more comprehensive validation efforts.” [I put the last part in bold]

Page 36: “Functional decomposition (sometimes known as piecewise validation)….When used in conjunction with face validation of the overall M&S results, functional decomposition is extremely useful in reconfirming previous validation of a recently modified portions of the M&S.”

I have not done a survey of all Army, Air Force, Navy, Marine Corps, Coast Guard, or Department of Defense (DOD) regulations. This one is enough.

So, “face validation” is asking one or more knowledgeable (or more senior) people if the model looks good. I guess it really depends on who the expert is and to what depth they look into it. I have never seen a “face validation” report (validation reports are also pretty rare).

Whose “faces” do they use? Are they outside, independent people, or people inside the organization (or the model designer himself)? I am kind of an expert, yet I have never been asked. I do happen to be one of the more experienced model validation people out there, having managed or directly created six-plus validation databases and having conducted five validation-like exercises. When you consider that most people have not done even one, should I be a “face” they contact? Or is this process often just a way to “sprinkle holy water” on the model and be done with it?

In the end, I gather that, for practical purposes, face validation means that if a group of people think the model is good, then it is good. In my opinion, “face validation” is often just an argument that allows people to explain away or simply dismiss the need for any rigorous analysis of the model. The pamphlet does note that “Face validation is a point of departure to determine courses of action for more comprehensive validation efforts.” How often have we seen the subsequent, more comprehensive validation effort? Very, very rarely. It appears that “face validation” is the end point.
Is this really part of the scientific method?

Combat Adjudication

As I stated in a previous post, I am not aware of any major validation efforts done in the last 25 years other than what we have done. Still, there is one other effort that needs to be mentioned. It is described in a 2017 report: Using Combat Adjudication to Aid in Training for Campaign Planning.pdf

I gather this was work by J-7 of the Joint Staff to develop Joint Training Tools (JTT) using the Combat Adjudication Service (CAS) model. There are a few lines in the report that warm my heart:

  1. “It [JTT] is based on and expanded from Dupuy’s Quantified Judgement Method of Analysis (QJMA) and Tactical Deterministic Model.”
  2. “The CAS design used Dupuy’s data tables in whole or in part (e.g. terrain, weather, water obstacles, and advance rates).”
  3. “Non-combat power variables describing the combat environment and other situational information are listed in Table 1, and are a subset of variables (Dupuy, 1985).”
  4. “The authors would like to acknowledge COL Trevor N. Dupuy for getting Michael Robel interested in combat modeling in 1979.”

Now, there is a section labeled verification and validation. Let me quote from that:

CAS results have been “Face validated” against the following use cases:

    1. The 3:1 rules. The rule of thumb postulating an attacking force must have at least three times the combat power of the defending force to be successful.
    2. 1st (US) Infantry Division versus 26th (IQ) Infantry Division during Desert Storm
    3. The Battle of 73 Easting: 2nd ACR versus elements of the Iraqi Republican Guards
    4. 3rd (US) Infantry Division’s first five days of combat during Operation Iraqi Freedom (OIF)

Each engagement is conducted with several different terrain and weather conditions, varying strength percentages and progresses from a ground only engagement to multi-service engagements to test the effect of CASP [Close Air Support] and interdiction on the ground campaign. Several shortcomings have been detected, but thus far ground and CASP match historical results. However, modeling of air interdiction could not be validated.

So, this is a face validation based upon a rule of thumb and three historical cases. Still, it is more than I have seen anyone else do in the last 25 years.