Applying AI to Small Multifamily Real Estate

A real estate sponsor’s firsthand look at the successes and challenges of using AI in a small multifamily real estate portfolio

Chris Lehman and Seth Priebatsch

Aug 27, 2024

This essay is cross-posted from Thesis Driven with permission.

Applying AI, the dominant tech story of the past year, to real estate, the largest economic sector in the world, is an obvious thought. At Groma, we’ve been testing the use of AI across nearly every stage of our vertically integrated real estate system, finding where it works, where it really works, and where it has generated a lot of hype but still struggles to add much value under current conditions.

As a small multifamily sponsor, we acquire and manage properties that are geographically separated and often over 100 years old, making them expensive to manage using traditional methods: opex ratios for other operators typically range from 40% to 50%. This makes efficiency-boosting innovation critical for producing institutional-grade returns, and AI infrastructure has become a key part of our tech stack. Today, we’ll share some of the ways it has made our portfolio easier to manage, as well as how we plan to keep developing it as AI improves and we gather more real-world training data.

Specifically, we’ll answer the following questions:

How are we currently using AI to improve operations, and how do we expect it to be used as it evolves?
What does this mean for real estate more broadly?
What obstacles stand in the way of broader adoption of AI in real estate?

What is Grobot?

Collectively, we call our AI tools Grobot. Grobot is a custom GPT model built using the OpenAI platform and other technologies, which we’ve trained with data from our real estate ecosystem and operational platform. In addition to standard background prompting regarding context, tone, and objectives, we feed Grobot information from across our ecosystem: process guides, appliance manuals, vendor relationships, smart locks, thermostats, city services, and other sources.

Once Grobot is primed with these resources, our in-house technology platform interacts with it via API to complete the appropriate stages of our workflows. The platform uses a set of custom prompts that map to our different use cases. For instance, an incoming maintenance request from our resident portal results in Grobot being prompted with the text and metadata of that request combined with a standard maintenance framing prompt. We dynamically populate the prompts with any additional context Grobot needs to make decisions, e.g. relevant information about the resident’s maintenance history. Based on Grobot’s responses, the platform then takes actions within our ecosystem, such as looping in a human agent or responding to the resident directly. At this point, some of these actions are fully automated, but the majority are still routed through our human agents for review.
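To make the prompt-assembly step concrete, here is a minimal sketch of how a framing prompt, request metadata, resident history, and request text might be combined into one prompt. Everything named here (`MaintenanceRequest`, `MAINTENANCE_FRAMING`, `build_prompt`, the field names, and the prompt wording) is a hypothetical illustration and not Groma's actual schema or prompt.

```python
import json
from dataclasses import dataclass


@dataclass
class MaintenanceRequest:
    """Hypothetical shape of a resident-portal submission."""
    resident_id: str
    unit: str
    channel: str
    text: str


# Hypothetical framing prompt; the real wording is not published.
MAINTENANCE_FRAMING = (
    "You are a property-management assistant. Classify the request below "
    "and propose a next step. Respond with a single JSON object only."
)


def build_prompt(request: MaintenanceRequest, history: list[str]) -> str:
    """Combine the framing prompt, request metadata, history, and request text."""
    metadata = {
        "resident_id": request.resident_id,
        "unit": request.unit,
        "channel": request.channel,
    }
    # Dynamically populated context: prior maintenance items, if any.
    history_block = "\n".join(f"- {item}" for item in history) or "- none on file"
    return "\n\n".join([
        MAINTENANCE_FRAMING,
        "Request metadata:\n" + json.dumps(metadata, sort_keys=True),
        "Maintenance history:\n" + history_block,
        "Request text:\n" + request.text,
    ])


if __name__ == "__main__":
    req = MaintenanceRequest("r-1", "2B", "portal", "Dishwasher is leaking")
    print(build_prompt(req, ["2024-03: dishwasher hose replaced"]))
```

Keeping the framing prompt fixed and varying only the populated sections makes it easier to compare responses across requests of the same type.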

[Figure: An example Grobot prompt. Grobot often requires significant cajoling to output solely JSON rather than JSON + plain text.]

In the future, as we continue to test and refine Grobot’s output, the training wheels will come off and Grobot will be able to take more and more actions without a human in the loop. The most complex, nuanced, or sensitive issues may always require a human touch, as will activities that require engagement in the physical world, such as maintenance.

Current Applications

As noted in our earlier essay, rental properties that are scattered and old come with high management costs relative to single large properties and new development. The primary component of these costs at almost every stage (acquisitions, renovation, leasing, maintenance) is labor, whether of Groma team members or external contractors. This means that using labor as efficiently as possible is critical to improving our opex ratio, especially as our portfolio grows and expands into new markets.

Acquisitions

Grobot’s impact begins with our acquisitions process. The economics of acquiring smaller properties mean that traditional labor-intensive acquisitions processes don’t work in our market segment. Each hour one of our human agents spends evaluating potential acquisition targets costs $60-70. There are 90,000 small-cap (2-20 unit) multifamily properties in the Greater Boston area, and we are theoretically interested in all of them (at the right price). A traditional analyst might take 2-4 hours to build a desktop model for a given property, and that model might need updating quarterly. At those rates a single pass over the whole market would cost on the order of $10-25 million, so the exact figure hardly matters: the approach is simply unaffordable, which is one of the reasons scaled players have not entered the small-cap multifamily space.

Given that we evaluate roughly 200 properties for every acquisition we end up making, even if we spent only two human hours evaluating each property that we ultimately pass on, we’d end up adding $26,000 to each acquisition, roughly 2% of the total acquisition price. This is a significant extra cost to add to each acquisition, and it’s part of why the small-cap multifamily space currently operates more on local networks and “deal feel” than on a programmatic evaluation of all opportunities.
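The arithmetic behind the $26,000 figure can be checked with a short calculation. This sketch assumes an hourly cost of $65, the midpoint of the $60-70 range quoted above. The function name is hypothetical, and the implied acquisition price is only what the "roughly 2%" statement works out to, not a figure reported in the essay.

```python
def screening_cost_per_acquisition(properties_per_deal: int,
                                   hours_per_property: float,
                                   hourly_cost: float) -> float:
    """Analyst cost of screening every candidate behind one closed deal."""
    return properties_per_deal * hours_per_property * hourly_cost


# Assumption: $65/hour is the midpoint of the essay's $60-70 range.
cost = screening_cost_per_acquisition(200, 2, 65.0)
low = screening_cost_per_acquisition(200, 2, 60.0)
high = screening_cost_per_acquisition(200, 2, 70.0)

# If that cost is roughly 2% of the purchase price, the implied price follows.
implied_price = cost / 0.02

print(f"Screening cost per acquisition: ${cost:,.0f} (range ${low:,.0f}-${high:,.0f})")
print(f"Implied acquisition price at a 2% share: ${implied_price:,.0f}")
```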

For context, we spend 80 hours of human analysis (on-site diligence, financial model tuning, renovation planning) on each property that we acquire, at a cost of roughly $6,000. Our technology platform makes this more efficient than it otherwise would be, but it can only be reduced so much. Nothing replaces physical diligence.

We first approached this high-scale/low-hit-rate problem by building a data-scraping pipeline to ingest all properties, then using Grobot to run models on first- and third-party data to filter out properties based on a wide variety of criteria. These include typical metrics like unit/room count and neighborhood comps as well as more esoteric factors like inter-sale duration or plumbing layout. This allowed us to exclude roughly 95% of properties within our initial candidate pool, saving analyst time on the order of hundreds of dollars per property. Grobot can perform at the level of a mid-range analyst. The poor soul sitting behind a screen running thousands of similar models wasn’t going to be doing a top-tier job anyway. Grobot isn’t the best real estate analyst available to hire, but it’s better than the analyst who could be hired to do this job, and it’s getting better all the time.
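The general shape of a screening funnel like this can be sketched in a few lines. The example below uses simple deterministic rules as a stand-in for the model-driven evaluation described above, so it illustrates the filtering structure only. The `Candidate` fields, the screen names, and every threshold except the 2-20 unit range are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record produced by a data-ingestion pipeline."""
    address: str
    units: int
    years_since_last_sale: float
    asking_price_per_unit: float


# Each screen returns True when the property should stay in the pool.
Screen = Callable[[Candidate], bool]

SCREENS: dict[str, Screen] = {
    # Small-cap range taken from the essay (2-20 units).
    "unit_count": lambda c: 2 <= c.units <= 20,
    # Hypothetical threshold on inter-sale duration.
    "inter_sale_duration": lambda c: c.years_since_last_sale >= 5,
    # Hypothetical price ceiling.
    "price_per_unit": lambda c: c.asking_price_per_unit <= 400_000,
}


def screen_pool(pool: list[Candidate]) -> tuple[list[Candidate], dict[str, int]]:
    """Return surviving candidates plus a count of rejections per screen."""
    survivors: list[Candidate] = []
    rejected_by = {name: 0 for name in SCREENS}
    for candidate in pool:
        # Record only the first screen a candidate fails.
        failed = next(
            (name for name, keep in SCREENS.items() if not keep(candidate)),
            None,
        )
        if failed is None:
            survivors.append(candidate)
        else:
            rejected_by[failed] += 1
    return survivors, rejected_by
```

Tracking which screen rejected each property makes it possible to audit the funnel later, for example to see whether one criterion is excluding most of the pool.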

As our track record has grown longer and our portfolio has grown larger, we’ve had the opportunity to gather more data on the financial performance of our properties post-acquisition. We can then evaluate Grobot’s predictions of the impact of different property characteristics on financial performance against this real-world evidence, allowing Grobot to update its weightings and make better predictions in the future.

Property Management

Those of you with experience in property management will be aware that tenant communication can be an extremely time-consuming process. While these labor hours are less expensive than those in acquisitions, they still add up. Coordinating tours, processing lease applications, and fielding maintenance requests can all involve significant back-and-forth, much of which is too variable for a hardwired/deterministic algorithm but sufficiently predictable that a properly trained LLM can handle it.

Groma receives hundreds of inbound tenant and broker communications per week via a variety of channels: SMS, email, resident portal submissions, voicemail, and live phone calls. Grobot helps out during several stages of this process.

[Figure: A stylized Grobot-mediated maintenance process flow.]

First, Grobot automatically scans the content of messages, summarizes them, and assigns them a category (e.g. “leasing inquiry” or “maintenance request”) and a priority level (low, medium, high). Voicemail transcription and summarization save considerable time on their own, and priority and category assignment ensure that communications are processed by the right people and in the right order. Deduplication is another important value-add here: tenant requests are sometimes sent repeatedly or via multiple channels, and Grobot groups them across channels, even when the wording does not match exactly, at a near-human level of accuracy. This ensures that our human agents have all the necessary context and prevents multiple agents from accidentally working on the same task in parallel.
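As a rough illustration of cross-channel deduplication, the sketch below groups messages from the same resident whose normalized text is similar. It uses plain string similarity from Python's standard library as a stand-in for LLM-based matching, so it would miss paraphrases that a language model could catch. The `Message` fields and the 0.8 threshold are assumptions chosen for the example.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Message:
    """Hypothetical inbound message record."""
    resident_id: str
    channel: str
    text: str


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences are ignored."""
    return " ".join(text.lower().split())


def is_duplicate(a: Message, b: Message, threshold: float = 0.8) -> bool:
    """Treat two messages as duplicates if same resident and similar text."""
    if a.resident_id != b.resident_id:
        return False
    ratio = SequenceMatcher(None, normalize(a.text), normalize(b.text)).ratio()
    return ratio >= threshold


def group_duplicates(messages: list[Message]) -> list[list[Message]]:
    """Greedily group each message with the first group whose lead it matches."""
    groups: list[list[Message]] = []
    for msg in messages:
        for group in groups:
            if is_duplicate(group[0], msg):
                group.append(msg)
                break
        else:
            groups.append([msg])
    return groups
```

Each resulting group can then be handed to a single agent, with every channel's copy of the request attached as context.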

Second, if the message involves a maintenance request, Grobot can parse the request, consult the relevant appliance manual, and propose an appropriate solution within a few seconds. It’s therefore not only more cost-efficient than human agents, but also much quicker, resulting in faster resolution of problems and a better tenant experience.

Third, Grobot is trained in fair housing laws and other local regulations. So are our human agents, of course, but humans can make mistakes, especially in the ever-growing maze of regulations that govern our industry. Computer programs can make mistakes too, but Grobot outperforms humans in accuracy and rule adherence. A combination of Grobot responses and human review where appropriate can ensure compliance with nuanced laws at lower cost and with higher accuracy than even the best training program can achieve.

Lastly, Grobot is now responding to requests for apartment showings, one of the simpler and more standardized tasks, without the need for any humans in the loop. This is only about 5% of our overall communications today, but its strong performance here so far suggests the potential for its deployment into a much broader range of communications going forward.

Future Applications

Playbook Integration

At Groma, we obsessively codify best practices for a staggering variety of different tasks into a library of hundreds of digital playbooks. These playbooks facilitate the diffusion of hard-earned knowledge both vertically and laterally across the company.

For example, we have playbooks for:

Conducting apartment showings, including prep, live showings, and follow-ups
How to use Matterport software and hardware to create virtual tours for our properties
Interacting with our residents and city agencies to solve common waste management problems
The year-long leasing cycle, centered around Boston’s university-driven 9/1 mass turnover
Software engineering team rituals, e.g. morning standup, cycle priorities, product sync, and management 1:1s

Over the coming months, we plan to begin testing Grobot on its ability to accurately fulfill the requirements of these playbooks. This will entail evaluating Grobot’s outputs for a given playbook relative to high-quality human ones, providing reinforcement learning from human feedback to refine these outputs, and modifying/formalizing the writing style and data format of the playbooks themselves to speed up this training process as we move through our library. The goal of this process is to give Grobot the ability to address as many property management tasks as possible without compromising on quality relative to a human agent.

Immediate Escalation

Running all resident communications in house enables faster escalation paths for severe incidents. Groma has a 24/7 emergency line and multiple mechanisms for our residents to reach us. But during periods of stress, sometimes protocols aren’t followed, and a low-urgency ticket might be filed for an incident that we believe to be high-urgency.

Grobot never sleeps, is always monitoring all channels, and can identify and escalate likely high-priority issues for human review. This also helps us cover low-occurrence, high-importance scenarios. For example, a report of a running toilet might not strike a resident as an emergency (annoying, but they’ll get to it in the morning), but we know that it can run up thousands of dollars in water bills or cause leaks elsewhere in the building, and we would prefer to dispatch a technician and fix it immediately. Most of our residents pay for their own water but still might not realize this cost risk. Grobot knows, and helps us act on their behalf quickly and effectively.

We can train our team to be on the lookout for these cases. But there are thousands of low-likelihood, high-importance scenarios that can arise at any time of day or night. Grobot can handle them with near-perfect recall. Humans need sleep, must each be trained individually, and often need retraining.
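One simple way to picture this kind of escalation is a priority floor: the priority the resident filed is kept unless the message text matches a known high-importance pattern, in which case it is raised. The sketch below uses regular expressions as a stand-in for model-based detection. The rule list is hypothetical: only the running-toilet case comes from the example above, and the "no heat" rule is an invented placeholder.

```python
import re

PRIORITY_ORDER = {"low": 0, "medium": 1, "high": 2}

# Hypothetical rules: (pattern, minimum priority when the pattern matches).
ESCALATION_RULES = [
    # From the essay's example: a running toilet can be costly if left overnight.
    (re.compile(r"\btoilet\b.*\brunning\b|\brunning\b.*\btoilet\b", re.I), "high"),
    # Invented placeholder rule, for illustration only.
    (re.compile(r"\bno heat\b", re.I), "high"),
]


def effective_priority(filed_priority: str, text: str) -> str:
    """Raise the filed priority to the highest floor among matching rules."""
    best = filed_priority
    for pattern, floor in ESCALATION_RULES:
        if pattern.search(text) and PRIORITY_ORDER[floor] > PRIORITY_ORDER[best]:
            best = floor
    return best


if __name__ == "__main__":
    print(effective_priority("low", "My toilet keeps running all night"))
```

Because the function only ever raises a priority, a ticket that a resident already marked as urgent is never downgraded by the rules.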

Predictive Renovation

As with many acquisitions of existing property, our financial model includes a capital budget item for in-unit and property-level improvements, assuming that many properties will produce better returns (higher rents, lower operating costs) with some upfront improvements. The determination of whether renovation was needed, how much to spend, and on which specific improvements was initially driven to a non-trivial degree by human intuition: a subjective understanding of the relationship between the property’s condition and broader neighborhood demand.

This approach worked well enough—our renovated properties experience notable income gains—but there was room for further predictive refinement. We might observe a rent increase following the application of our Standard renovation package, but how do we know the cost/benefit balance wouldn’t have been better with our Plus package?

Our renovation sample set isn’t yet big enough to enable Grobot to make accurate predictions about which renovation tier is financially optimal, but as we perform more renovations and as the financial impacts of those renovations have more time to play out, we will begin training Grobot to make inferences based on the patterns that emerge from that experience. We expect that Grobot will be able to integrate information about renovation costs and rent growth effects with pre-existing variables like building condition and neighborhood characteristics, taking much of the guesswork out of the current renovation process.

Implications for Real Estate

Grobot is in its infancy, yet its voicemail summarization alone saves us roughly one full-time employee’s worth of time across a portfolio of roughly 350 units. We estimate that its work on our leasing workflow saves an additional full-time employee’s worth of time today, and that figure will scale as it takes on more of our workflow.

Zooming out to the top level, a major goal for Grobot is to enable us to operate small multifamily assets at the same level of operational efficiency as is typically achieved with large multifamily assets. As of 2024, we have proven that we can do this in terms of both operating expense ratio (~32%) and units per team member, matching the “golden rule” of one management and one maintenance staff member per 100 units.[1]

The biggest change we expect to see over the coming years is the institutionalization of small multifamily rental properties. The first wave of institutional real estate investment focused almost entirely on large properties—residential or otherwise—with low opex ratios stemming from their scale. Small properties, both single-family and multifamily, were ignored, as their costly nature made them non-viable.

The global financial crisis of 2007-08 changed this by creating favorable conditions for investors to acquire single-family properties in bulk and experiment with novel technologies and management practices to make them more efficient. Single-family rentals (SFR) are now a major institutional investment category. Over the past few years, a combination of macroeconomic factors and the ability to improve on the SFR playbook has created a similar opportunity in small multifamily, bolstered by new AI capabilities.

In a competitive, well-functioning housing market, the existence of this technology would result in these cost savings being passed on to renters in the form of lower equilibrium rents. Unfortunately, even when operating costs go down, regulatory supply constraints[2] result in a disproportionate share of these economic gains accruing to rental housing owners, with relatively modest decreases in rental rates compared to a more dynamic supply environment. Reforming overly burdensome land use regulations is therefore a crucial step towards enabling renters to enjoy a greater share of the benefits of AI property management.

At the same time, housing supply isn’t perfectly inelastic, and factors other than regulation still meaningfully constrain new housing production. Especially in the current environment of high interest rates, labor costs, and material costs, many new developments that have secured regulatory approval have nevertheless been put on hold, as their financial models relied on lower all-in input costs. This means that improvements in operating efficiency such as those enabled by Grobot could unlock new supply creation on the margin, even if these gains are smaller than they would be under saner regulatory systems.

Obstacles to Adoption

An important caveat to the rosy picture painted above is that we do not expect this kind of solution to be universally applicable, or at least not to the same degree of effectiveness. Grobot relies on the existence of an integrated property system that gives it both a high degree of access to interconnected, mutually compatible software systems and accurate knowledge of the physical components of the properties under management.

A tenant in a Groma property who calls about their broken microwave can receive a quick and accurate Grobot-generated response because Grobot already knows which microwave model they have, the statistically optimal next step towards resolving their particular issue based on past experience, and the correct escalation path in the event that the first step is inadequate. Depending on the nature of the issue, a maintenance tech can be dispatched within a minute, or the tenant can be given the information to solve the problem themselves.

Contrast this with an off-the-shelf AI property manager (or, more likely, a collection of several of these connected with each other and with non-AI third-party software solutions) tasked with overseeing a portfolio of non-standardized properties. Incomplete and inadequately communicated information regarding the physical characteristics of properties will often result in incorrect diagnosis of problems, requiring preventative human proofreading in the best cases and expensive remediation of misapplied solutions in the worst. Even when the relevant facts are known, smaller sample sizes resulting from heterogeneous properties and components mean that model training will take longer and cost more.

To be clear, Grobot itself has its fair share of challenges. One of the biggest is that it requires significant time investment from senior team members. This includes the initial environment setup, integration with our other software systems, and ongoing output tweaks that we expect to taper off over time. Its non-deterministic responses, characteristic of most LLMs, are sometimes more of a bug than a feature; explicitly asking it to return a JSON block works 99% of the time, but sometimes results in a plaintext paragraph. It fluently and accurately answers substantive questions about its knowledge base, but has trouble with requests to reproduce the content of specific files (a task that might be easier for traditional deterministic programs). And while it can solve a wide range of problems in information space, the most expensive tasks in property management are still in meatspace, beyond its reach.
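A common defensive pattern for the JSON-versus-plaintext problem is to search the model's reply for the first parseable JSON object and to fall back to human review when none is found. The sketch below shows one way to do that with Python's standard `json` module. The function name and the fallback policy are assumptions for illustration, not a description of Groma's platform.

```python
import json
from typing import Any, Optional


def extract_json_object(reply: str) -> Optional[dict[str, Any]]:
    """Return the first JSON object embedded in reply, or None if there is none."""
    decoder = json.JSONDecoder()
    start = reply.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at the given index
            # and ignores any trailing text after it.
            value, _end = decoder.raw_decode(reply, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        # Not valid JSON at this brace; try the next one.
        start = reply.find("{", start + 1)
    return None


if __name__ == "__main__":
    reply = 'Sure! Here is the result:\n{"category": "maintenance", "priority": "high"}'
    parsed = extract_json_object(reply)
    if parsed is None:
        print("No JSON found; route to a human agent for review.")
    else:
        print(parsed["category"], parsed["priority"])
```

Treating a missing or malformed object as a signal for human review keeps the rare plaintext reply from silently triggering an automated action.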

With that said, some elements of Grobot’s functionality described above should be generalizable across other portfolios and asset classes. Text is text, and the conditions of Groma’s ecosystem are not so unusual that tasks like “summarize a block of emailed or transcribed text, categorize it, and route it appropriately” could not be applied by other operators. The amount of friction in this integration will vary with each operator’s level of software development expertise and will therefore affect the magnitude of time/cost savings, but the upfront cost of implementation will still likely be outweighed by these efficiency gains in sufficiently large portfolios.

More broadly, there remain open questions of consumer sentiment towards more involvement of AI-operated entities in real estate and elsewhere. No one enjoyed phone trees for customer service, but they did help reduce costs. AI demonstrably reduces costs, but will its performance be good enough to reach parity or better with human operators? Or will consumers reject it, and will renters pay a meaningful premium for guaranteed human touch? At Groma, we are making the bet that Grobot can do most things cheaper, better, and faster than humans can, lowering costs while improving quality. Some things, of course, will always require a human touch, from a smaller, more senior team.

Over time, improvements to the fundamental capabilities of AI may enable frontier models to brute force their way through these problems with sheer intelligence. In the meantime, operators should apply an extra layer of skepticism to AI-enabled proptech developed independently of the properties it is intended to manage.

—Chris Lehman, Seth Priebatsch, and Jason Urton

[1] We’re currently at one of each per 96 units.

[2] E.g. zoning and other land use restrictions that prevent or make it more difficult for developers to create new housing supply in response to high demand.
