Jul 8, 2025 · 7 min read · Afrasayab Mahsud · Research

Building the First AI Integration for SAP 2000

The first post in a series. The prototype is an MCP server bolted onto SAP 2000. Here is what it took to make a language model drive a structural analysis package — 2,000 OAPI methods, two LLMs, and one obstinate idea.

It is July 2025 as I write this. ConGro AI does not exist yet — not as a product, not as a name. What exists is a working prototype: a Model Context Protocol (MCP) server I have been building on weekends that lets Claude drive SAP 2000 in plain English. As far as I can find, it is the first AI integration with SAP 2000.

I am posting this for two reasons. First, because if anyone else is staring at the same problem — a structural analysis package with a deep API and a workflow that mostly punishes patience — you should know the path is real. Second, because writing it down forces me to admit which decisions were lucky and which were earned.

The thesis

Pick any week in a structural engineer’s life. Open SAP 2000 or ETABS. Define the grid. Define materials. Define sections. Draw beams, columns, slabs. Apply loads. Run analysis. Run design checks. Iterate. Most of these steps do not require engineering judgment — they require persistence, and persistence scales linearly with project size.

My thesis: a large language model that can call SAP 2000’s OAPI directly should be able to compress the mechanical 80% of that work by an order of magnitude, while leaving the engineer in charge of the judgment that actually matters. The hard part was that, at the time, no AI could talk to SAP 2000 at all.

Why MCP

Anthropic released the Model Context Protocol earlier this year. MCP gives an LLM a standardized way to call external tools — file systems, databases, custom servers. SAP 2000’s OAPI (Open Application Programming Interface) exposes roughly 2,000 functions for programmatic control: create joints, define load patterns, run static analysis, extract section results. If I could wrap those functions as MCP tools, an LLM could in theory build a full structural model by composing them.
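To make "wrap those functions as MCP tools" concrete, here is a minimal sketch of the shape involved. An MCP tool is essentially a name, a description, and a JSON Schema for its inputs; the method name and parameters below are illustrative stand-ins, not entries from the real catalog.

```python
def oapi_method_to_tool(name, description, params):
    """Turn an OAPI-style method signature into an MCP tool descriptor.

    params maps parameter name -> (JSON Schema type, human description).
    """
    return {
        "name": name,
        "description": description,
        "inputSchema": {
            "type": "object",
            "properties": {
                p: {"type": t, "description": d} for p, (t, d) in params.items()
            },
            "required": list(params),
        },
    }

# Hypothetical tool wrapping a frame-creation call; units and axes are
# spelled out in the description so the model cannot guess them.
tool = oapi_method_to_tool(
    "add_frame_by_coord",
    "Create a frame element between two XYZ coordinates (meters, global axes, +Z up).",
    {
        "xi": ("number", "Start joint X coordinate"),
        "yi": ("number", "Start joint Y coordinate"),
        "zi": ("number", "Start joint Z coordinate"),
        "xj": ("number", "End joint X coordinate"),
        "yj": ("number", "End joint Y coordinate"),
        "zj": ("number", "End joint Z coordinate"),
        "section": ("string", "Name of a previously defined frame section"),
    },
)
```

The descriptor is what the LLM actually sees; the server maps the tool name back to the underlying OAPI call at execution time.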

Theory was easy. The hard parts were:

  • Which of the ~2,000 OAPI methods are actually usable from a tool-calling LLM?
  • How do you describe each method to the model precisely enough that it picks the right one for a given user request?
  • How do you handle the fact that SAP 2000 is a stateful Windows application, not a stateless HTTP service?
  • What happens when the model gets it wrong — and it will get it wrong, often?

The 1,000-function audit

I wrote a separate harness — call it the cataloger — that walked every OAPI method, parsed its parameter list, called it with synthetic inputs, captured return types and side effects, and produced a structured description for each one. Roughly 1,000 of the ~2,000 methods turned out to be viable for tool calling. The rest were UI-only wrappers, stateful in ways an LLM could not reason about, or paper-thin variants of methods already exposed elsewhere.

For each viable method, the cataloger emitted: a plain-English description, the exact parameter schema, the expected return shape, and a list of natural phrasings that should map to the tool. The harness itself ran for several days on my workstation. The output is what eventually became the source-of-truth catalog for the MCP server.
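A toy version of the cataloger's core loop, for flavor: walk the public methods of an API object, read their signatures, and emit one structured record per method. The FakeOAPI class is a stand-in for the real COM interface, and the record fields are a simplification of what the real harness captures.

```python
import inspect

class FakeOAPI:
    """Stand-in for the real OAPI surface (illustrative methods only)."""

    def AddJoint(self, x: float, y: float, z: float) -> int:
        """Create a joint at (x, y, z) and return its ID."""
        return 0

    def SetLoadPattern(self, name: str, pattern_type: int) -> None:
        """Define a load pattern of the given type code."""

def _type_name(ann):
    # Map an annotation to a short name; "any" when unannotated.
    if ann is inspect.Parameter.empty:
        return "any"
    return getattr(ann, "__name__", str(ann))

def catalog(api):
    """Emit one structured record per public method of `api`."""
    records = []
    for name, method in inspect.getmembers(api, predicate=inspect.ismethod):
        if name.startswith("_"):
            continue
        sig = inspect.signature(method)
        records.append({
            "name": name,
            "doc": inspect.getdoc(method) or "",
            "params": {p: _type_name(v.annotation) for p, v in sig.parameters.items()},
            "returns": _type_name(sig.return_annotation),
        })
    return records
```

The real cataloger additionally calls each method with synthetic inputs and records side effects, which is the part that takes days rather than milliseconds.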

Two models, two strategies

I tested two LLMs end-to-end as the driver: Claude Sonnet 4 and GPT-4.1. Different strengths surfaced fast.

  • Claude Sonnet 4 was stronger on long, multi-step builds. It kept model state coherent across 30+ tool calls without losing track of joint IDs, frame assignments, or section properties.
  • GPT-4.1 was faster on individual tool calls but tended to fabricate parameter values when the build state got complex.

For the prototype I went with Sonnet 4 as the primary driver. I am writing this post inside Claude Desktop, with Windsurf and other MCP-aware clients connected to the same server. The same architecture should work for any MCP-capable client — that portability was the whole point of choosing MCP over a bespoke integration.

Phase 1 — getting a model to load

The first goal was embarrassingly small: open SAP 2000, create a new model, define one steel section, draw a single column, run a static analysis, and report the base reaction. It took longer than I want to admit.

The model kept choosing the wrong overload of similarly named methods. It kept passing degree values where radians were expected. It assumed Y was the vertical axis, when SAP 2000's global coordinate system is right-handed with +Z up. Most of these were docstring problems on my end, not model problems: every time I improved the description for a tool, that whole class of mistakes disappeared.
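One cheap guard that came out of this phase, shown here as a hedged sketch: rather than trusting the model to send radians, the tool schema forces it to declare the unit, and the wrapper normalizes before anything reaches the OAPI. The function name and interface are illustrative, not the production code.

```python
import math

def normalize_angle_rad(value, declared_unit):
    """Return an angle in radians, converting when the caller declared degrees.

    Forcing the LLM to pass `declared_unit` explicitly turns a silent
    degrees-vs-radians mix-up into either a correct conversion or a loud error.
    """
    if declared_unit == "deg":
        return math.radians(value)
    if declared_unit == "rad":
        return value
    raise ValueError(f"unknown angle unit: {declared_unit!r}")
```

The same pattern applies to length and force units: make the model commit to a unit in the tool call, then convert centrally.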

When the first end-to-end run came back with a base reaction matching what I had computed by hand, that was the moment the project felt real. Phase 1 was done: session management, automated software control, file operations, and coordinate-based modeling for lines and areas — all driven by natural language.
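The Phase 1 milestone, written out as a call sequence. The stub below records calls instead of driving the real application, and the method names are simplified stand-ins for the actual OAPI calls, but the order is the one the LLM composes from natural language.

```python
class StubSap:
    """Records the call sequence instead of driving SAP 2000."""

    def __init__(self):
        self.calls = []
        self.base_reaction = -50.0  # canned result standing in for the solver

    def __getattr__(self, name):
        # Any unknown attribute becomes a recorder; get_base_reaction
        # additionally returns the canned value.
        def record(*args, **kwargs):
            self.calls.append(name)
            return self.base_reaction if name == "get_base_reaction" else None
        return record

def phase1_run(sap):
    """The smallest end-to-end build: one column, one load, one reaction."""
    sap.new_model()
    sap.define_steel_section("W310x97")
    sap.add_column(height_m=3.0, section="W310x97")
    sap.apply_point_load(joint="top", fz_kn=-50.0)
    sap.run_static_analysis()
    return sap.get_base_reaction()

sap = StubSap()
reaction = phase1_run(sap)
```

Checking that returned reaction against a hand calculation is exactly the sanity test described above.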

Phase 2 — geometry that represents an actual building

Phase 2 is in flight as I write this. The next layer is real geometry: multi-bay frames, sloped roofs, areas (slabs and walls), solid elements, and cable elements. Each one introduces its own gotchas:

  • Areas need joint ordering consistent with a chosen normal direction or you get nonsense local axes
  • Solid elements need 8 joints in a strict winding order
  • Cable elements only behave correctly after their endpoint joints have proper boundary conditions

I added a pre-flight validator — separate from the LLM — that runs every proposed model through a sanity check before any OAPI calls fire. That cut the iteration loop from "LLM proposes -> SAP errors -> LLM retries" to "LLM proposes -> validator catches -> LLM fixes" in roughly one round trip. Most of the speed in the prototype comes from that validator, not from the model itself.
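A cut-down version of that pre-flight idea: pure checks on the proposed model, run before any OAPI call fires. The data shape and the specific rules here are illustrative assumptions; the real validator covers many more cases, including winding order and boundary conditions.

```python
def validate(model):
    """Return a list of human-readable errors for a proposed model dict.

    An empty list means the proposal may proceed to OAPI calls.
    """
    errors = []
    joints = set(model.get("joints", {}))
    # Solids must reference exactly 8 joints.
    for name, solid in model.get("solids", {}).items():
        if len(solid["joints"]) != 8:
            errors.append(
                f"solid {name}: needs exactly 8 joints, got {len(solid['joints'])}"
            )
    # Areas must not repeat a joint in their boundary.
    for name, area in model.get("areas", {}).items():
        if len(area["joints"]) != len(set(area["joints"])):
            errors.append(f"area {name}: repeated joint in boundary")
    # Cables must reference joints that exist.
    for name, cable in model.get("cables", {}).items():
        for j in cable["joints"]:
            if j not in joints:
                errors.append(f"cable {name}: endpoint {j} is not a defined joint")
    return errors
```

Because the errors are plain strings, they feed straight back to the LLM as the "validator catches" step, which is what makes the fix round trip so short.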

The roadmap

The plan covers structural workflows end-to-end:

  • Phase 1 — Session, file, and coordinate-level modeling (done)
  • Phase 2 — Full geometry: points, frames, areas, solids, cables (in development)
  • Phase 3 — Load patterns, cases, and combinations
  • Phase 4 — Analysis execution and post-processing
  • Phase 5 — Code-based design checks (ASCE 7, ACI 318, AISC 360, and international equivalents)

Computers and Structures, Inc. ships a remarkably consistent OAPI across SAP 2000, ETABS, and CSiBridge. Once the framework is complete for one product, it should extend to the others with mostly mechanical work. Autodesk Robot Structural Analysis is a likely fourth target: different API, same pattern.

What this is for

The practical applications I keep coming back to:

  • Natural-language structural modeling — describe the building, watch the model appear
  • Automating the repetitive 80% so engineers can spend time on the 20% that requires judgment
  • Streamlined geometry creation for early-stage iteration, when you want to compare schemes quickly
  • Reducing manual input errors — the validator catches most before they reach the solver

The setup demonstrates direct AI-to-engineering-software communication. It is not a chatbot describing what to click. It is a system that can drive the analysis package itself, then read the output and act on it.

Acknowledgements

Computers and Structures, Inc. — thanks for building software with an OAPI surface area robust enough to enable an integration like this. The fact that 1,000 of 2,000 methods are usable from a tool-calling LLM is a credit to API design, not just to me. The remaining gaps (UI-only flows, undocumented state) are the kind of things that get filed as feature requests, not roadblocks.

What’s next

Next post will document Phase 2 closing out — full geometry pipelines and the validator architecture. After that, Phase 3 (loads and combinations) and the first end-to-end design report.

If you build with structural analysis software and want to be on the list when this becomes a product, the easiest way is to follow along. There is more coming.