Underwriting Sucks – So I Built an AI That Does It For Me | Dynamasoft

April 20, 2025

How I Used Generative AI to Revolutionize Real Estate Underwriting

By someone who got tired of Excel sheets and decided to build the future.

The Problem with Underwriting Multifamily Deals

If you’ve ever underwritten a multifamily real estate deal, you know the grind. It starts with a 70-page PDF: the Offering Memorandum (OM). Buried inside are the details you need: investment summaries, income and expense tables, unit mixes, pro forma rent rolls. Sometimes it’s clearly structured, but more often than not, it’s a design-heavy, marketing-driven document that makes copy-paste feel like brain surgery.

Multiply that by ten deals a week and suddenly underwriting becomes a bottleneck. For me, it was the pain point I couldn’t ignore any longer. So, I turned to something I’d recently fallen in love with—Generative AI.

Choosing the Capstone: A Personal Mission

When I joined the Generative AI Intensive Capstone program, I knew I wanted to work on something that wasn’t just technically challenging but personally meaningful. I’m in the multifamily investment space. Underwriting is core to my business. And like most entrepreneurs, I hate inefficiency. So I thought:

“What if I could upload an OM and have the AI do 80% of the work for me—pull out the investment summary, extract the rent roll, calculate market comps, and give me a clean summary I can trust?”

That became my north star for this capstone: an AI-powered Deal Analyzer.

The Vision for the Deal Analyzer

The Deal Analyzer is a pipeline. It starts with an OM PDF and ends with an LLM-powered chatbot that can answer investor questions. Here’s the flow I envisioned:

Upload the PDF: User uploads an OM.
Extract Key Data: The system extracts:
- Investment Summary (price, cap rate, GRM, etc.)
- Income & Expenses
- Unit Mix
Enhance with External Data: Get market rent comps from sources like Rentometer or Zillow.
Store in Vector DB: Embed all this information and index it in FAISS to make it queryable.
Ask Questions: Let users ask natural language questions like:
- “What’s the Year 1 cap rate?”
- “What’s the average market rent for 2-bed units?”
- “Compare the current rent roll with the pro forma.”

Bringing It to Life: Tools & Stack

To bring this vision to life, I needed the right tools. Here’s what powered each layer of the pipeline:

PDF Parsing: PyPDF2 and LangChain Document Loaders
Text Processing: RecursiveCharacterTextSplitter for chunking
Embedding & Vector DB: OpenAI Embeddings + FAISS
LLM Interface: ChatOpenAI via LangChain
Structured Data Models: Pydantic for defining InvestmentSummary and Unit Mix schemas
Graph Execution: LangGraph for chaining logic between nodes
UI Prototype: Jupyter Notebook with I/O simulation

Building the Foundation: Extracting the Investment Summary

The first step was building a robust extractor. I defined a Pydantic model for the investment summary, then iterated on a prompt to teach the LLM how to recognize these elements in semi-structured text.

Few-shot examples were crucial here. I showed the model several real-world OMs with different formats and had it return the correct fields.

Sample input looked like this:

    INVESTMENT OVERVIEW
    Price: $3,500,000
    Cap Rate: 5.25%
    GRM: 12.5x

And the model would convert that to:

    { "price": 3500000, "cap_rate": 5.25, "grm": 12.5 }
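To make the schema step concrete, here is a minimal sketch of validating that JSON reply before trusting it. It uses a standard-library dataclass as a stand-in for the Pydantic model, so it is not the project's actual code; the `parse_summary` helper and the derived properties are assumptions added for illustration. The two formulas are the standard definitions: cap rate is NOI divided by price, and GRM is price divided by gross annual rent.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class InvestmentSummary:
    """Stdlib stand-in for the Pydantic model; field names follow the sample JSON."""
    price: float      # asking price in dollars
    cap_rate: float   # capitalization rate in percent (5.25 means 5.25%)
    grm: float        # gross rent multiplier

    def __post_init__(self) -> None:
        # Reject values that cannot be right, so bad extractions fail loudly.
        if self.price <= 0 or self.grm <= 0:
            raise ValueError("price and grm must be positive")
        if not 0 < self.cap_rate < 100:
            raise ValueError("cap_rate must be a percentage between 0 and 100")

    @property
    def implied_noi(self) -> float:
        # Cap rate = NOI / price, so NOI = price * cap rate.
        return self.price * self.cap_rate / 100

    @property
    def implied_gross_rent(self) -> float:
        # GRM = price / gross annual rent, so gross rent = price / GRM.
        return self.price / self.grm


def parse_summary(llm_output: str) -> InvestmentSummary:
    """Parse the model's JSON reply into a validated summary (hypothetical helper)."""
    data = json.loads(llm_output)
    return InvestmentSummary(
        price=float(data["price"]),
        cap_rate=float(data["cap_rate"]),
        grm=float(data["grm"]),
    )


summary = parse_summary('{ "price": 3500000, "cap_rate": 5.25, "grm": 12.5 }')
print(summary.implied_noi, summary.implied_gross_rent)
```

For the sample deal, the implied NOI is $183,750 and the implied gross annual rent is $280,000, which gives a quick cross-check against the income tables elsewhere in the OM.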

Unit Mix and Market Rent Magic

Next, I tackled unit data. Pulling out unit mix (bed/bath/size/rent) is critical to understanding a deal. Once extracted, I enriched the data using predefined logic:

1 bed / 1 bath → $1,300 market rent
2 bed / 1 bath → $1,900 market rent

Eventually, I plan to replace this with an API call to a rent comp service. But for the capstone, I hardcoded logic to simulate enrichment. I then updated the original summary with these market rents, giving me a side-by-side comparison of actual vs. market.
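The hardcoded enrichment can be sketched roughly as a lookup keyed by bedroom and bathroom count. The two market rents come from the examples above; the function name, the field names, and the sample unit mix are hypothetical and only there to show the actual-versus-market comparison.

```python
# Hardcoded market rents keyed by (beds, baths); the two values come from the post.
MARKET_RENTS = {(1, 1): 1300, (2, 1): 1900}


def enrich_unit_mix(unit_mix: list[dict]) -> list[dict]:
    """Return copies of each unit row with market rent and the gap to it."""
    enriched = []
    for unit in unit_mix:
        market = MARKET_RENTS.get((unit["beds"], unit["baths"]))
        row = dict(unit, market_rent=market)
        # Leave the gap empty when no comp exists for this unit type.
        row["rent_gap"] = None if market is None else market - unit["rent"]
        enriched.append(row)
    return enriched


# Hypothetical unit mix, for illustration only.
unit_mix = [
    {"beds": 1, "baths": 1, "count": 10, "rent": 1150},
    {"beds": 2, "baths": 1, "count": 6, "rent": 1700},
    {"beds": 3, "baths": 2, "count": 2, "rent": 2400},
]

enriched = enrich_unit_mix(unit_mix)
for row in enriched:
    print(row)
```

Swapping the dictionary lookup for a rent comp API call later would only change how `market` is obtained; the comparison logic stays the same.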

Making the Data Searchable with FAISS

Once I had structured data and full OM content, I needed a way to store and query it. I chose FAISS, a library for fast vector similarity search, for its performance and simplicity.

Using OpenAI’s text-embedding-3-small, I created embeddings of the OM and loaded them into FAISS. This allowed me to run similarity searches, retrieving the top 3 chunks related to any query.
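For intuition, top-k retrieval boils down to ranking chunks by how close their vectors are to the query vector. The sketch below is a toy, pure-Python version with made-up 3-dimensional vectors and cosine similarity; it is not FAISS and not the project's code. Real embeddings have far more dimensions, and FAISS indexes typically rank by L2 distance or inner product.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], chunks: list[tuple], k: int = 3) -> list[str]:
    """chunks is a list of (text, vector); return the k texts closest to the query."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Toy 3-dimensional vectors standing in for real embeddings.
chunks = [
    ("Cap rate: 5.25%", [0.9, 0.1, 0.0]),
    ("Unit mix: 10 x 1BR, 6 x 2BR", [0.1, 0.9, 0.0]),
    ("Pro forma NOI table", [0.8, 0.2, 0.1]),
    ("Neighborhood overview", [0.0, 0.1, 0.9]),
]

results = top_k([1.0, 0.0, 0.0], chunks, k=3)
print(results)
```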

Adding the Final Touch: LLM Query with RAG

The cherry on top was the RAG (Retrieval-Augmented Generation) system.

Here’s what it looked like in action:

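    from langchain_core.messages import HumanMessage

    def ask_vector_db(query: str, top_k: int = 3):
        # db (the FAISS store) and llm (ChatOpenAI) are created earlier in the notebook.
        results = db.similarity_search(query, k=top_k)
        # Join the retrieved chunks into one context string for the prompt.
        context = "\n\n".join(doc.page_content for doc in results)
        return llm.invoke([HumanMessage(content=f"{query}\n\nContext:\n{context}")])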

Now the user could type:

“How much is the pro forma net operating income?”
And get a precise, reference-backed answer.
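One way to get answers that point back to their source is to number the retrieved chunks in the prompt and ask the model to cite them. This is a sketch of that idea only; the `build_rag_prompt` name, the instruction wording, and the sample chunks are assumptions, not the prompt used in the project.

```python
def build_rag_prompt(question: str, chunks: list[str]) -> str:
    """Combine retrieved chunks and the question into one grounded prompt."""
    # Number each chunk so the model can cite which passage supports its answer.
    context = "\n\n".join(f"[{i}] {text}" for i, text in enumerate(chunks, start=1))
    return (
        "Answer using only the context below and cite passages as [n]. "
        "If the answer is not in the context, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical retrieved chunks, for illustration only.
prompt = build_rag_prompt(
    "How much is the pro forma net operating income?",
    ["Pro forma NOI: see financial summary table.", "Cap Rate: 5.25%"],
)
print(prompt)
```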

LangGraph for Chained Logic

To tie everything together, I used LangGraph. This let me model the steps as a graph:

read_pdf → extract_summary → get_market_rent → store_to_vectordb

Each step passed an updated state using a Pydantic-backed MessagesState. LangGraph made the execution logical and composable. I even visualized the graph using Mermaid.
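Stripped of the framework, the graph above is a chain of functions that each take the shared state and return an updated copy. The sketch below shows that pattern in plain Python; it is not LangGraph code, every node body is a placeholder, and the `example_om.pdf` path is hypothetical.

```python
from typing import Callable

State = dict  # shared state passed from node to node


def read_pdf(state: State) -> State:
    # Placeholder: a real node would extract text from the uploaded PDF.
    return {**state, "text": "Price: $3,500,000 Cap Rate: 5.25% GRM: 12.5x"}


def extract_summary(state: State) -> State:
    # Placeholder: a real node would call the LLM extractor on state["text"].
    return {**state, "summary": {"price": 3500000, "cap_rate": 5.25, "grm": 12.5}}


def get_market_rent(state: State) -> State:
    # Placeholder: hardcoded rents, matching the enrichment step in the post.
    return {**state, "market_rents": {"1bd/1ba": 1300, "2bd/1ba": 1900}}


def store_to_vectordb(state: State) -> State:
    # Placeholder: a real node would embed the text and add it to the index.
    return {**state, "indexed": True}


PIPELINE: list[Callable[[State], State]] = [
    read_pdf, extract_summary, get_market_rent, store_to_vectordb,
]


def run(initial: State) -> State:
    state = initial
    for node in PIPELINE:
        state = node(state)  # each node returns an updated copy of the state
    return state


final = run({"pdf_path": "example_om.pdf"})
print(sorted(final))
```

Because each node only reads and extends the state, steps can be reordered, tested in isolation, or replaced (for example, swapping the hardcoded rents for an API call) without touching the others.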

Lessons Learned: What AI Can and Can’t Do (Yet)

Along the way, I learned a few things:

Context matters: LLMs struggle when the relevant facts aren’t in the prompt. Embeddings plus RAG put the right passages in front of them.
Garbage in, garbage out: Poor PDF parsing kills performance. Clean input is everything.
Human-in-the-loop wins: AI should assist, not replace, the underwriter—yet.

But I also learned that:

AI can parse dozens of pages in seconds.
AI can summarize key insights with natural language.
AI can scale your underwriting pipeline beyond human speed.

The Bigger Picture: Scaling Human Intelligence

This project wasn’t just about real estate. It was about reimagining how knowledge work happens. Underwriting is a microcosm of a much bigger trend: the fusion of domain expertise with machine intelligence.

I see a future where:

Lawyers upload contracts and get case strategy.
Doctors upload scans and get clinical summaries.
Investors upload OMs and get full underwriting in minutes.

And I want to be part of building that future.

What’s Next: DealPartners.ai

This capstone isn’t the end. It’s the prototype for my startup: DealPartners.ai.

A platform where investors upload OMs, get instant insights, and collaborate with their teams—all powered by GenAI.

I’m working on:

A React frontend
FastAPI backend
LangGraph orchestration
Real-time chat over vector DB
Insights like IRR projections, break-even analysis, and value-add upside

Final Thoughts: Why This Matters

This wasn’t just a course project. This was a glimpse into the future. And not some distant AI-driven utopia—this is now.

If you’re in real estate, AI can already change the way you operate.
If you’re in any data-driven industry, you’re next.

This project showed me that AI isn’t a buzzword anymore. It’s a tool. A partner. A second brain. And it’s here.

Thanks, Google, for creating this opportunity.

I started this project tired of underwriting.
I ended it inspired by what’s possible.

And I’m just getting started.
