If You Build It – Will They Come?

“If you build it – they will come” is a quote often used satirically to deride investors in white elephant projects. Logically, however, the most critical determinant of tourist flows is the number of beds available. The same reasoning extends to tourist infrastructure more broadly, which is what I examine here. Can a destination be built to attract tourists?

I can’t answer this question directly with the data at my disposal. However, I hope to gain some insights that I can use to rank destinations by their available infrastructure. If a destination has more tourist infrastructure than my model predicts, then the travel experience should be, on average, better. The model needs to account for various types of infrastructure and how burdened the location already is with tourist traffic.

I divide tourist infrastructure into three general categories: accommodation, transportation, and tourist support. We could also consider the availability of restaurants and the price of airline tickets. The most significant problem with this classification, however, is measuring only the infrastructure that is relevant to tourists.

We can measure the availability of overnight accommodation by simply counting hotels. Likewise, measuring the length of passenger railroad tracks in a given region is straightforward, though potentially misleading: the underground and light rail lines found in cities are not included in official statistics, and I could not find a consistent way of including them.

Tourist support refers to the less tangible infrastructure available to visitors. For example, how easy is it for a lost tourist to find their way or for a spontaneous visitor to discover what to see or do? Measuring this directly is impossible, though I can imagine various proxy measures. The best measure would be the marketing or tourism budgets of the different administrative regions, but this data is not available to the public.

My proxy variable is the length of the English Wikipedia article, which captures the rough amount of information available about a destination. However, I collected it only for cities, since nobody is likely to search for the name of a German Landkreis.

Since the data is cross-sectional, other variables, such as flight costs or airport availability, are more complex to map to specific geographical locales. The number of restaurants or other amenities may reflect local taste and wealth rather than tourist spending, so I don’t include these numbers.

You can see my review here for an overview of the other variables.

Social Media and Over-Tourism

Thus far, I have used overnight stays per capita as the variable of interest in these tourism statistics. This variable presents a problem: hotels may be built in response to tourist demand, which creates an endogenous relationship with the hotels-per-capita term on the right-hand side. Since the aim here is to expand the model to focus on available tourist infrastructure, it makes sense to use overnight stays per hotel as the dependent variable and to focus on the other infrastructure terms.
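To make the two statistics concrete, here is a minimal sketch of how they could be computed for a single city. The function name and the example figures are hypothetical; the log transform follows the "Log Overnights p.H." labels in the regression tables.

```python
import math


def overnight_ratios(overnights, population, hotels):
    """Return (log overnights per capita, log overnights per hotel)."""
    if overnights <= 0 or population <= 0 or hotels <= 0:
        raise ValueError("all inputs must be positive")
    per_capita = overnights / population  # tourist load relative to residents
    per_hotel = overnights / hotels       # tourist load relative to accommodation
    return math.log(per_capita), math.log(per_hotel)


# Hypothetical city: 2,000,000 overnight stays, 500,000 residents, 100 hotels.
log_pc, log_ph = overnight_ratios(2_000_000, 500_000, 100)
```

Two cities with the same per capita value can differ widely on the per hotel value, which is why the second statistic says more about how burdened the existing infrastructure is.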

The following two graphs show the relationship between Instagram posts per city tag and the two statistics for overnight stays. For the per capita statistic, the relationship is close to linear, and cities of all sizes sit on either side of the regression line. By contrast, when we look at the relative tourist load on a city (overnight stays per hotel), more Instagram posts per capita are associated with a higher burden on the available infrastructure. This effect is most substantial for larger cities.

The bubble size reflects each city's population.

The code for all of the graphs can be seen in my GitHub repository. I tried to optimize them as much as possible for mobile view, but some will be better seen in desktop mode.

The bubble size reflects the number of hotels in each city.

The implication is that as social media engagement increases for a city, it becomes a self-fulfilling prophecy attracting more tourists than we would otherwise expect. The challenge now will be to expand this model to the rest of the country, where we lack meaningful statistics for social media posts.

Momentum and Over-Tourism

To proxy social media hype, I took the one- and five-year changes in tourist overnight stays before 2019, the year on which this analysis focuses. This allows me to extend the infrastructure analysis to the rest of Germany’s administrative counties. The following graphs demonstrate the relationships between short- and long-term momentum and both per capita and per hotel overnight stays.

The differences between the per capita and per hotel statistics mirror the trends in the city-only statistics. The reason is much simpler, though: it’s easier to scale up tourist flows than to build new hotels. So, we expect destinations with rapidly growing tourist markets to have their infrastructure utilized more than expected. The heteroskedasticity displayed by the larger cities is worth noting. It seems that scale exerts a separate impact from simple tourist momentum.

Overnight stays per hotel are also significantly correlated with the other available statistics measuring tourist infrastructure. The relationship to railroad mileage is remarkably different from that of the per capita statistic. This is a statistical artifact: the rail lines of dense urban centers are not counted in official statistics.
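As a rough illustration of the momentum variables, the sketch below computes a relative change in overnight stays over a given horizon. The exact formula and base year are assumptions on my part for illustration: I take 2018 as the last year "before 2019", so the one-year change runs 2017 to 2018 and the five-year change 2013 to 2018. The figures are invented.

```python
def momentum(stays_by_year, base_year, horizon):
    """Relative change in overnight stays over `horizon` years ending at `base_year`."""
    start = stays_by_year[base_year - horizon]
    end = stays_by_year[base_year]
    if start <= 0:
        raise ValueError("starting value must be positive")
    return end / start - 1.0


# Hypothetical county; only the years needed for the two horizons are listed.
stays = {2013: 800_000, 2017: 950_000, 2018: 1_000_000}

momentum_1yr = momentum(stays, base_year=2018, horizon=1)
momentum_5yr = momentum(stays, base_year=2018, horizon=5)
```

A log difference would serve the same purpose; the relative change is used here only because it is easier to read.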

Modeling Tourist Infrastructure

When we look at the first of the simple OLS models using the city dataset, we see that the model has less explanatory power than desired, with only about half of the variance captured by our variable space (R-squared of roughly 0.55). Notably, though, the length of the Wikipedia article is a powerful predictor of overnight stays, even after controlling for size and GDP. This also holds for foreign tourists, although for them momentum is a much more important driver of overnight stays.

The differences in demand drivers for domestic and foreign tourists are evident here. While domestic tourism numbers are difficult to disentangle from non-tourist travel (the significance of the coast for domestic tourists represents leisure travel rather than tourism), we can see that domestic tourists are less driven by hype than foreign tourists are.

The evidence suggests that foreign tourists prefer well-known destinations that have been hyped up in the media. In other words, for foreign tourists, the conspicuous consumption of tourism goods is driven by the need for a high social return from their audience. High social return requires that the destination be easily recognizable and deemed desirable from an Instagram post and that the destination be reachable without excessive effort. Desirability appears to be driven less by the intrinsic characteristics of a city, e.g. historical monuments, and more by general familiarity and association. From this simple model, though, I can’t draw any real conclusions about the nature of this relationship.

Log Overnights p.H. and Infrastructure

Y Var: Overnights       R-Squared: 0.555
Model: OLS              Adj. R-Squared: 0.514
No. Obs: 95             Covar. Type: HC3

X Var                       Coef  Std Err       t  P>|t|
Constant                  8.1983    1.231   6.660  0.000
Momentum 1Yr              0.0596    0.461   0.129  0.897
Momentum 5Yr              0.4176    0.263   1.587  0.116
Area Km2 **              -0.0012    0.000  -2.830  0.006
Log GDP p.C.             -0.1549    0.097  -1.604  0.112
Monuments                 0.0064    0.009   0.755  0.452
Capital                   0.1361    0.108   1.265  0.209
Coast *                   0.1801    0.080   2.261  0.026
Wikipedia Len. ***        0.3110    0.057   5.479  0.000

Log Foreign Overnights p.H. and Infrastructure

Y Var: Overnights       R-Squared: 0.553
Model: OLS              Adj. R-Squared: 0.512
No. Obs: 95             Covar. Type: HC3

X Var                       Coef  Std Err       t  P>|t|
Constant                  1.6892    2.137   0.790  0.432
Momentum 1Yr *           -2.0231    1.004  -2.014  0.047
Momentum 5Yr **           1.0825    0.342   3.168  0.002
Area Km2 *               -0.0016    0.001  -2.008  0.048
Log GDP p.C.              0.1715    0.173   0.994  0.323
Monuments                 0.0155    0.010   1.500  0.137
Capital                   0.0499    0.195   0.256  0.798
Coast                    -0.1026    0.440  -0.233  0.816
Wikipedia Len. ***        0.4319    0.094   4.579  0.000
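
The tables report the covariance type as HC3, a heteroskedasticity-robust estimator that fits the uneven spread seen among the larger cities. The sketch below shows what that calculation involves, using NumPy and synthetic data. It is not the code behind the tables: the function name `ols_hc3`, the single predictor, and the generated numbers are all assumptions for illustration.

```python
import numpy as np


def ols_hc3(X, y):
    """OLS coefficients with HC3 robust standard errors.

    X must already contain a constant column.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    # Leverage h_i = x_i' (X'X)^-1 x_i, the diagonal of the hat matrix.
    leverage = np.einsum("ij,jk,ik->i", X, xtx_inv, X)
    # HC3 inflates each squared residual by 1 / (1 - h_i)^2.
    weights = (resid / (1.0 - leverage)) ** 2
    meat = X.T @ (X * weights[:, None])
    cov = xtx_inv @ meat @ xtx_inv  # sandwich estimator
    se = np.sqrt(np.diag(cov))
    return beta, se, beta / se


# Synthetic data: 95 observations (the city sample size), one made-up predictor
# whose noise grows with its magnitude, i.e. heteroskedastic errors.
rng = np.random.default_rng(0)
n = 95
wiki_len = rng.normal(size=n)
noise = rng.normal(size=n) * (1.0 + 0.5 * np.abs(wiki_len))
y = 8.0 + 0.3 * wiki_len + noise
X = np.column_stack([np.ones(n), wiki_len])

beta, se, t_stats = ols_hc3(X, y)
```

In statsmodels, the same covariance choice is requested with `sm.OLS(y, X).fit(cov_type="HC3")`. The coefficients are identical to ordinary OLS; only the standard errors, and therefore the t and p values, change.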

Now, when I expand the model to include all German counties, we see that the importance of momentum for foreign tourists drops significantly. Instead, we see GDP per capita, historical monuments, old buildings, and urban density as the primary drivers. The implication is that foreign tourists who leave the cities are driven by different factors than those who do not.
The most curious implication here is that foreign tourist flows actually decrease as the number of old buildings per capita increases for a given urban density. This holds even while the number of historical monuments correlates positively with overnight stays.

So even if foreign tourists visiting local towns and the countryside are less driven by social media hype, their overnight stays are associated with specific destinations. This conclusion is obvious, but the model hints at the potential of hidden gems for special interest travelers. For those willing to forgo urban environments with a high density of well-known historical monuments, the travel experience measured by overnight stays per hotel should improve.

Log Overnights p.H. and Infrastructure

Y Var: Overnights       R-Squared: 0.423
Model: OLS              Adj. R-Squared: 0.407
No. Obs: 384            Covar. Type: HC3

X Var                       Coef  Std Err       t  P>|t|
Constant                  6.6615    0.963   6.920  0.000
Momentum 1Yr              0.4967    0.506   0.982  0.327
Momentum 5Yr *            0.3675    0.165   2.224  0.027
Log Railroad p.Km2       -0.0140    0.058  -0.241  0.810
Log GDP p.C. **           0.2146    0.076   2.842  0.005
Monuments ***             0.0178    0.004   4.482  0.000
Old Buildings p.C.         0.000    0.000  -0.281  0.779
Urban Area (%) ***        0.0079    0.002   3.843  0.000
Coast ***                 0.7116    0.106   6.690  0.000
Vineyard Area (%) *        0.000    0.000  -1.747  0.082
Airports                  0.1032    0.066   1.554  0.121

Log Foreign Overnights p.H. and Infrastructure

Y Var: Overnights       R-Squared: 0.579
Model: OLS              Adj. R-Squared: 0.567
No. Obs: 384            Covar. Type: HC3

X Var                       Coef  Std Err       t  P>|t|
Constant                  1.5833    1.792   0.884  0.377
Momentum 1Yr              0.1082    0.818   0.132  0.895
Momentum 5Yr *            0.6795    0.310   2.193  0.029
Log Railroad p.Km2        0.1305    0.101   1.295  0.196
Log GDP p.C. ***          0.6440    0.130   4.973  0.000
Monuments **              0.0199    0.006   3.244  0.001
Old Buildings p.C. ***   -0.0012    0.000  -6.655  0.000
Urban Area (%) **         0.0110    0.003   3.501  0.001
Coast                    -0.1223    0.212  -0.578  0.564
Vineyard Area (%)          0.000    0.000   0.461  0.645
Airports *                0.2847    0.110   2.593  0.010

Conclusion

If you build hotels, will the tourists come? Maybe, but you first must develop a few historical monuments and launch a multi-year social media hype campaign. Domestic tourism is more intrinsically driven, as locals spend more time and less money. Foreign tourism is driven by famous big cities and ease of travel, as foreign visitors have less time. Getting foreign tourists to spend more money to explore beyond the big cities is more challenging. Well-known historical monuments and a major airport nearby seem to be essential factors.

More generally, my next step will be to use this model to uncover some under- and overrated destinations in Germany. One goal of this site is to give travel recommendations that enhance the travel experience. This will hopefully be quite interesting for the special-interest traveler looking for cultural or historical immersion.
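
One simple way such a ranking could work, sketched here under my own assumptions rather than as the final method, is to compare each destination's actual log overnight stays per hotel with the model's prediction and sort by the difference. The destination names and values below are invented.

```python
def rank_by_residual(actual, predicted):
    """Sort destinations from most under-used to most over-used relative to the model."""
    residuals = {name: actual[name] - predicted[name] for name in actual}
    return sorted(residuals.items(), key=lambda item: item[1])


# Hypothetical log overnight stays per hotel: observed vs. model prediction.
actual = {"A": 9.2, "B": 8.1, "C": 8.9}
predicted = {"A": 8.8, "B": 8.6, "C": 8.9}

ranking = rank_by_residual(actual, predicted)
```

Under this reading, a negative residual marks a destination whose hotels are less burdened than the model expects (a candidate hidden gem), while a positive residual marks one that is more crowded than its infrastructure would suggest.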

