We are going to hand the forum over to Geoffrey Clark for one day a week. Mr. Clark has posted here before and presented at the last two HAACs. He will end up doing a series of posts each Wednesday on Modern Air Combat Data. This is his introductory post:
—————————-
Air Combat Data, modern Air Warfare, Ukraine and AI
Many are following the war in Ukraine in intense detail. I have attempted to gather meaningful statistics for a cliometric analysis, to add some analytical rigor to the debates about the relative effectiveness of the new F-16 aircraft recently deployed by the Ukrainian Air Force. I’ve been looking at the sources available for this type of analysis, for example the Wikipedia page or the Statista comparison page, and I have been sorely disappointed in what is currently available. There is a wealth of information about what was lost, including serial numbers, but the process by which it was lost, or the why, is so far simply not available. This type of information is typically not shared for years or decades after a conflict, as it might compromise the effectiveness of the related air forces in a future conflict. This is why analysis of the Korean and Vietnamese conflicts is still ongoing: as new information emerges, it allows better correlation and cross-checking of claims and losses.
I’m following in the footsteps of John Stillion, referencing his famous RAND “clubbing baby seals” brief on the F-35, as well as his more accessible work at CSBA, Trends in Air-to-Air Combat (2015). I’ve been gathering the detailed claims and losses data for air combat in the jet age. Sometimes this data is made openly available by the participating air forces and air defense forces, but it often comes from less-than-official sources, like hobbyist websites or even ejection seat manufacturer websites. Nonetheless, with proper scrutiny and comparison across sources, it can produce some important insights into the air combat process, and thereby give some predictive power for future conflict in the air.
It is also important to get the first-hand narratives from the pilots, airmen and soldiers involved in the conflict. I believe these are exceedingly important for understanding the context of the combat situation, what was known at the time that decisions were made, and how this led to the outcomes. The idea that AI based on LLMs with a lot of data (from the internet? From a bunch of air combat games or simulations?) can produce a fighter-pilot-like capability to make decisions in the heat of battle … is fraught at best. A clean, curated, reliable, accurate dataset is needed now more than ever. I recently watched a 1982 presentation by Admiral Grace Hopper, which showed excellent foresight into data processing and information flows! In order to train future AI agents effectively, actual combat data from real war must be prioritized well above data from exercises, and especially above simulation data and “internet” data, whatever that is.
Therefore, I’ll start a series of blog posts to explore this topic as I progress through the analysis of data available on various conflicts in the 20th and 21st centuries. I’ll develop some primitive AI agents for air combat and simulate air combat based upon this solid data foundation. I’ll also postulate various what-ifs and wargame scenarios, both in the air domain and in multi-domain combat.
I’m pleased to be able to attend the HAAC 2024 conference in person this year, to present these ideas and get some insight and support from the other attendees.
Thanks for reading, and for your comments!