Short Price: Analyzing the price determinants of Airbnb listings in NYC.


Airbnb started in 2008 as an online marketplace for arranging or offering lodging, primarily holiday accommodation, and tourism experiences. According to its website, the company does not own any of the real estate listed on its platform. It merely acts as a broker that connects guests with homeowners who are willing to offer their homes for short-term accommodation in exchange for a fee. The commissions received from this brokerage are its main source of revenue.

Since its founding 11 years ago, Airbnb has successfully disrupted the hospitality industry, offering more than 5 million properties in over 85,000 cities across the world and reaching a market valuation exceeding $30 billion. New York City is the company’s biggest market, with over 52,000 listings as of November 2018.

This paper investigates the strength of the relationship between the listing price on Airbnb and several factors that may affect it. In particular, it examines how host activity and customer activity metrics (reviews) affect price.

Statement of the research

This paper will attempt to explore the price determinant relationships for Airbnb listings in New York City (the company’s biggest market), with the hope of providing insight into how suppliers on an online marketplace come to a pricing decision. Recent reports have suggested that most suppliers (homeowners) simply look at what their neighbors are charging and then try to match the local rate.

This paper will analyze Airbnb’s own data to investigate the key price determinants of listings. Even though the company officially informs suppliers that the price they charge is entirely up to them, other reports have suggested that there is a pricing algorithm behind the final figure posted for all listings.

With the platform coming under scrutiny for raising rental costs in cities, it’s imperative for us to understand the price determinant relationships for a typical Airbnb listing, most importantly in a metropolis like New York City. Price plays an important role in the operations and impact of online marketplaces such as Airbnb, particularly because the rising rental costs in host cities could also be described as the multiplier effect of the prices set by Airbnb suppliers — who themselves are locals.

Research methodology

To examine the research question proposed in this paper, a multiple linear regression is conducted to assess whether the explanatory variables predict the dependent variable and to describe the relationship between them.

The multiple linear regression model, estimated by ordinary least squares (OLS), is employed for this study; it is arguably the most widely used tool in econometrics. It allows us to estimate the relationship between a dependent variable and a set of explanatory variables.
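
As a minimal sketch of what OLS estimation does, the Python snippet below fits a linear model by least squares on a small synthetic dataset. The data, the coefficient values and the helper name `fit_ols` are illustrative assumptions, not the study's data or code.

```python
import numpy as np

def fit_ols(X, y):
    """Return OLS coefficients for y = b0 + X @ b, with the intercept first."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])  # prepend the intercept column
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)  # least-squares solution
    return coefs

# Synthetic, noise-free data (hypothetical numbers):
# price = 20 + 30 * accommodates + 15 * bedrooms
X = np.array([[1, 1], [2, 1], [4, 2], [6, 3], [3, 1], [5, 2]])
y = 20 + 30 * X[:, 0] + 15 * X[:, 1]

coefs = fit_ols(X, y)
print(np.round(coefs, 2))  # intercept, accommodates, bedrooms
```

Because the synthetic prices contain no noise, the fit recovers the generating coefficients; with real listings the estimates would carry sampling error.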

The functional form of the relationship between the dependent and independent variables is given below:

y = β0 + β1X1 + β2X2 + β3X3 + β4X4 + β5X5 + β6X6 + β7X7 + β8X8 + U

y = Price
β0 = intercept
β1 = regression coefficient 1, X1 = total number of reviews, used to indicate the property’s popularity with Airbnb’s pricing algorithm
β2 = regression coefficient 2, X2 = accommodates, the number of guests a property can accommodate
β3 = regression coefficient 3, X3 = bedrooms, the number of bedrooms
β4 = regression coefficient 4, X4 = overall satisfaction or rating received
β5 = regression coefficient 5, X5 = Brooklyn: 1 or 0
β6 = regression coefficient 6, X6 = Manhattan: 1 or 0
β7 = regression coefficient 7, X7 = Queens: 1 or 0
β8 = regression coefficient 8, X8 = Staten Island: 1 or 0
U = error term

The Bronx is the base (reference) category for the borough dummies, so it has no variable of its own in the equation: a listing in the Bronx has X5 through X8 all equal to 0. This was done to avoid the dummy variable trap, and it means that each borough coefficient measures the price difference relative to a comparable Bronx listing.
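
The borough encoding can be sketched in a few lines of Python. This is an illustration only; the function name `encode_borough` and the dictionary layout are assumptions, not the code used in the study.

```python
BOROUGHS = ["Bronx", "Brooklyn", "Manhattan", "Queens", "Staten Island"]
BASE = "Bronx"  # reference category: it gets no dummy column

def encode_borough(borough):
    """Return 0/1 dummies for every borough except the base category."""
    if borough not in BOROUGHS:
        raise ValueError(f"unknown borough: {borough}")
    return {name: int(name == borough) for name in BOROUGHS if name != BASE}

print(encode_borough("Manhattan"))  # exactly one dummy is switched on
print(encode_borough("Bronx"))      # base category: every dummy is 0
```

Keeping four dummies for five boroughs is what avoids the trap: a fifth dummy would always equal 1 minus the sum of the other four, making it perfectly collinear with the intercept.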

All independent variables were entered into the model simultaneously. Each variable is evaluated on its relationship with the dependent variable and on what it adds to the prediction of the dependent variable. We picked the variables most commonly used across previous studies (subject to data availability), as presented below:

Price determinants used in previous studies

The hypotheses to be tested in this analysis are:
H0: β1 = 0 or total number of reviews is not a significant price determinant.

H1: β1 ≠ 0 or total number of reviews is a significant price determinant.

It is important to note that this model is exposed to omitted factors not captured here, which are absorbed into the U term. The variables were selected based on what are considered the key price determinants.

Literature review

The growth of Airbnb has continued to attract substantial attention, but most scholarly research has focused on Airbnb’s competitiveness relative to its counterpart, the hotel industry, and on its challenges. A number of studies have been conducted to understand the key price determinants on Airbnb.

Using a dataset of 5,779 Airbnb listings managed by 4,602 hosts in 41 census tracts of Austin, Texas, Chen and Xie (2017) developed a price model to test the effects of a group of utility-based attributes on the price of Airbnb listings. They observed characteristics of the listings, attributes of hosts, reputation of listings, and market competition. They examined these attributes as they relate to price, and thereby estimated consumers’ willingness to pay for specific attributes of a listing. They found that the functional characteristics of Airbnb listings were significantly associated with the price of the listings, and that three out of five behavioral attributes of hosts were statistically significant. However, the effect of listing reputation on price was weak.

Zhang et al. (2017) investigated the key factors affecting Airbnb listing prices based on a sample of 974 Airbnb listings in Metro Nashville, using both the General Linear Model (GLM) and the Geographically Weighted Regression model. Their GLM analysis suggested that the distance to the local convention center, the number of reviews and the review rating scores are significantly associated with the Airbnb listing price. Their study concluded that the number of reviews and the rating score negatively affect the Airbnb listing price in Metro Nashville, but also noted that Airbnb in the area was in an early stage of development at the time of their research.

Studying the impact of increased Airbnb listings on rental prices, Barron et al. (2018) found that a 1% increase in Airbnb listings leads to a 0.018% increase in rents and a 0.026% increase in house prices. Using Airbnb listings information from 33 cities, Ikkala and Lampinen (2014) reported that the reputation of Airbnb listings (reviews) is correlated with their price because reputation has direct economic consequences for transactions on Airbnb: ‘reputational capital’ or ratings can be converted into the price of the host’s property.

Wang and Nicolau (2017) identified five price-determinant categories of Airbnb listings, including host attributes, site and property attributes, amenities and services, rental rules, and online review ratings.

Gutt et al. (2017) reported that a high review can be converted to profit in the hospitality industry, and rating visibility leads to an increase in price.

This study will proceed from the major price determinants identified by existing literature and will serve to extend current literature on Airbnb’s dynamic pricing approach and also contribute to the body of knowledge in helping us to conduct informed cost-benefit analysis of the platform.

Data source

The primary data for this research is a cross-sectional dataset of Airbnb listing activity and metrics in New York City for 2018. It was sourced from Kaggle under the Airbnb Open Data initiative. There are 41245 unique properties listed in this dataset.

Empirical Results

Descriptive summary of the variables:


Correlation:

Correlation analysis

Initial correlation analysis reveals a negative correlation between the number of bedrooms and the Manhattan dummy; Manhattan is the most densely populated borough in NYC. Manhattan is also negatively correlated with the number of people a typical property can accommodate. None of the pairwise correlations is unusually high.
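
For readers who want to see how a correlation between a 0/1 dummy and a count variable is computed, here is a small Python sketch of the Pearson coefficient. The six-listing sample is made up for illustration and is not taken from the dataset.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical mini-sample: Manhattan dummy and number of bedrooms
manhattan = [1, 1, 0, 0, 1, 0]
bedrooms = [1, 1, 2, 3, 0, 2]

r = pearson(manhattan, bedrooms)
print(round(r, 2))  # negative: the made-up Manhattan listings have fewer bedrooms
```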

Distribution of listings in the five boroughs:

Listing type by borough

Manhattan is the heart of the Airbnb gig economy.

OLS regression result:

OLS regression result

The estimated equation is:

Price^ = -17.82 - 0.27 reviews - 4.52 satisfaction + 34.69 accommodates + 17.8 bedrooms + 25.14 Brooklyn + 82.02 Manhattan + 4.61 Queens + 10.11 Staten Island
R² = 0.230, n = 41245
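
To make the estimated equation concrete, the sketch below plugs a hypothetical listing into the reported coefficients. The listing's characteristics, including a rating of 4.5 on an assumed 0 to 5 scale, are invented for illustration; only the coefficients come from the regression output.

```python
# Coefficients as reported in the estimated equation
INTERCEPT = -17.82
SLOPES = {
    "reviews": -0.27,
    "satisfaction": -4.52,
    "accommodates": 34.69,
    "bedrooms": 17.80,
}
# Borough effects relative to the Bronx, the base category
BOROUGH_EFFECT = {
    "Bronx": 0.0,
    "Brooklyn": 25.14,
    "Manhattan": 82.02,
    "Queens": 4.61,
    "Staten Island": 10.11,
}

def predict_price(reviews, satisfaction, accommodates, bedrooms, borough):
    """Predicted nightly price implied by the estimated equation."""
    return (
        INTERCEPT
        + SLOPES["reviews"] * reviews
        + SLOPES["satisfaction"] * satisfaction
        + SLOPES["accommodates"] * accommodates
        + SLOPES["bedrooms"] * bedrooms
        + BOROUGH_EFFECT[borough]
    )

# Hypothetical listing: 10 reviews, rating 4.5, sleeps 2, 1 bedroom
manhattan_price = predict_price(10, 4.5, 2, 1, "Manhattan")
bronx_price = predict_price(10, 4.5, 2, 1, "Bronx")
print(round(manhattan_price, 2), round(bronx_price, 2))
```

The gap between the two predictions is exactly the Manhattan coefficient, which is how a dummy coefficient should be read: the price difference relative to an otherwise identical listing in the base borough.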

The intercept is the predicted price for a Bronx listing with every other regressor equal to zero, a combination that lies outside the observed data, so its negative sign should be read with caution. One reading is that hosts whose property attracts no customer activity are likely to lower their prices, here by $17.82. The negative sign on reviews is surprising, although it is consistent with the findings in Zhang et al. (2017). It could also mean that hosts with a lot of reviews (customers) are inclined to price their positively reviewed property below the market price so as to guarantee a supply of renters. The coefficient is statistically significant at the 5% level.

The coefficient on the overall satisfaction rating is also surprisingly negative. This could be interpreted to mean that, since NYC is one of the most expensive cities in the world, customers are more likely to give a good rating to lower-priced properties. This variable is statistically significant at the 5% level as well.

Properties with more room for parties and private events attract higher prices, as is evident from the large positive coefficient on accommodates. Bedrooms has a positive sign and is statistically significant at the 5% level: homes with more bedrooms can command higher prices.

The variables for the New York boroughs all have positive signs, and the very large value on Manhattan reflects that properties in that part of New York are very highly priced. Two of the borough variables, Queens and Staten Island, are statistically insignificant at the 5% level, meaning that prices there cannot be distinguished from prices for comparable listings in the Bronx, the base borough. The R2 of 0.23 tells us that the model explains only 23% of the variation in our dependent variable.
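
As a reminder of what R2 measures, the short Python sketch below computes it as one minus the ratio of the residual sum of squares to the total sum of squares. The four prices and predictions are hypothetical.

```python
def r_squared(actual, predicted):
    """Share of the variation in `actual` explained by `predicted`."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # unexplained
    ss_tot = sum((a - mean_y) ** 2 for a in actual)                # total
    return 1 - ss_res / ss_tot

# Hypothetical nightly prices and model predictions
actual = [100, 150, 200, 250]
predicted = [110, 140, 220, 230]

r2 = r_squared(actual, predicted)
print(round(r2, 2))
```

An R2 of 0.23 therefore means that 77% of the variation in listing prices is left in the residuals of the model.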

A joint test of significance reveals that all the independent variables are jointly significant. As such we can conclude that the selected variables are relevant to this study.

Joint test of significance
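
The overall F statistic for joint significance can be approximated from the reported R2, the sample size and the number of regressors. The sketch below uses the rounded figures quoted in this paper (R2 = 0.230, n = 41245, 8 regressors), so its result is an approximation and may differ from the software output.

```python
def overall_f_statistic(r2, n_obs, n_regressors):
    """F statistic for H0: all slope coefficients are zero."""
    df_resid = n_obs - n_regressors - 1  # residual degrees of freedom
    return (r2 / n_regressors) / ((1 - r2) / df_resid)

f_stat = overall_f_statistic(0.230, 41245, 8)
print(round(f_stat, 1))
```

The resulting value is far above the 5% critical value of roughly 1.94 for 8 numerator degrees of freedom and a very large denominator, which agrees with the conclusion that the regressors are jointly significant.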

Also, a test for heteroscedasticity yields robust standard errors that are not widely different from the OLS standard errors. This suggests that heteroscedasticity is not a serious concern, in which case, under the remaining Gauss-Markov assumptions, the OLS estimator is BLUE.

Test of heteroscedasticity
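
The comparison between conventional and heteroscedasticity-robust standard errors can be sketched as follows. This example uses White's HC0 estimator on synthetic data with constant error variance; the estimator choice, the data and the function name `ols_standard_errors` are assumptions, since the paper does not state which robust variant was used.

```python
import numpy as np

def ols_standard_errors(X, y):
    """Return (coefs, classical_se, hc0_se) for OLS with an intercept."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    coefs = xtx_inv @ X.T @ y
    resid = y - X @ coefs
    # Classical variance: sigma^2 * (X'X)^-1
    sigma2 = resid @ resid / (n - k)
    classical = np.sqrt(np.diag(sigma2 * xtx_inv))
    # White HC0 variance: (X'X)^-1 X' diag(e^2) X (X'X)^-1
    meat = X.T @ (X * resid[:, None] ** 2)
    robust = np.sqrt(np.diag(xtx_inv @ meat @ xtx_inv))
    return coefs, classical, robust

# Synthetic data with homoscedastic noise (hypothetical numbers)
rng = np.random.default_rng(0)
x = rng.uniform(1, 6, size=500)
y = 20 + 30 * x + rng.normal(0, 10, size=500)

coefs, classical, robust = ols_standard_errors(x, y)
print(np.round(classical, 3), np.round(robust, 3))
```

When the error variance really is constant, the two sets of standard errors should be close, which is the pattern the paper reports; a large gap would instead point to heteroscedasticity.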

Hypotheses testing:

H0: β1 = 0 or total number of reviews is not a significant price determinant. H1: β1 ≠ 0 or total number of reviews is a significant price determinant.

The absolute t value, |-10.877| = 10.877, is greater than the two-sided 5% critical value of about 1.96 (equivalently, the p value is below 0.05), so we can reject the null hypothesis that total number of reviews is not a significant price determinant. This is consistent with past studies.
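
The decision rule can be checked with a few lines of Python. The sketch uses the standard normal approximation to the t distribution, which is an assumption that is reasonable here because the sample has more than 41,000 observations.

```python
import math

def two_sided_p_value(t_stat):
    """Two-sided p-value under the standard normal approximation (large n)."""
    return math.erfc(abs(t_stat) / math.sqrt(2))

t_stat = -10.877        # reported t value on total number of reviews
critical_value = 1.96   # two-sided 5% critical value, normal approximation

p_value = two_sided_p_value(t_stat)
reject_null = abs(t_stat) > critical_value
print(reject_null, p_value < 0.05)
```

Both comparisons lead to the same decision: the t statistic is compared with a critical value, while the p-value is compared with the significance level.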


Conclusion

This study was conducted to investigate the direction of the relationships between price and the key variables identified as price determinants for Airbnb listings in a megacity like NYC. The OLS regression results indicate the magnitude of the determinants used in this study. Even though many of the factors affecting the dependent variable have not been fully investigated, this study presents estimates that can be used to explain some of the factors influencing listing prices in NYC.

For example, Manhattan properties command significantly higher prices. This study also suggests that hosts may lower prices when they have no renters available and need to drive traffic; the negative sign on the intercept is consistent with this reading, although the intercept extrapolates beyond the observed data.

The result presented here holds significant value for theoretical and empirical research of this nature in the future.

In terms of the areas for future research, the low R-squared value of 0.23 tells us that there are other factors affecting how listing price is determined for a property that are not captured in this model. The availability of data on what hosts charge for cleaning would be useful in future research, as well as the data on the logic behind Airbnb’s price suggestion algorithm.

Extensions of this research could focus on how socio-economic activities such as big city events cause price hikes, and on how amenities affect demand: Wi-Fi may make a listing in a commercial nerve center like New York more likely to be snapped up, whereas summer listings will see many fewer bookings if they don’t offer air conditioning.

As city officials in places like San Francisco begin to talk about the idea of placing a cap on the number of listings a property can have in a year, the importance of undertaking a study like this one becomes even more pronounced.

References

What is Airbnb and how does it work?

Barron, Kyle, Kung, Edward and Proserpio, Davide. Research: When Airbnb listings in a city increase, so do rent prices. Harvard Business Review. April 2019.

Barron, Kyle, Kung, Edward and Proserpio, Davide. The Effect of Home-Sharing on House Prices and Rents: Evidence from Airbnb (March 29, 2018). Available at SSRN.

Chen, Yong & Xie, Karen (2017). Consumer valuation of Airbnb listings: A hedonic price approach. International Journal of Contemporary Hospitality Management, 29(9). The Special Issue of Sharing Economy. Forthcoming

Ikkala, Tapio & Lampinen, Airi. (2014). Defining the price of hospitality: Networked hospitality exchange via Airbnb. Proceedings of the ACM Conference on Computer Supported Cooperative Work, CSCW. 173–176. 10.1145/2556420.2556506.

Wooldridge, J.M. Introductory Econometrics: A Modern Approach, 5th Edition. Cengage Learning.