Machine Learning | ACA Death Spiral

One of the touted benefits of the Affordable Care Act was that, by fostering transparency, there would be greater competition in the health insurance market and that premiums would go down as a result. We now have data to help see whether competition within the various Exchanges has succeeded in reducing prices. This post, based on a scholarly talk I recently gave at the University of San Diego’s Workshop on Computation, Mathematics and Law, will suggest that the effect, if there is one, is small and subtle. It looks as if having just one seller of a product within a county may lead to somewhat higher prices, but the effect may not be robust. The methodology used here is a first cut. Whether other methodologies might tease out a larger relationship remains to be seen.

Note to ACA Death Spiral Fans: The USD conference mentioned above is one reason for the infrequent posts as of late. It’s been a busy period. Sorry. There’s A LOT to write about. Keeping track of Obamacare is at least a full time job.

Data

The data for this project comes mostly from good old healthcare.gov, which, if one forages around a bit, actually contains a user-friendly database exportable in various standard formats such as CSV and JSON describing all 78,392 plans currently being sold in 2,512 counties via the federal Exchange. Each plan is described by 128 fields, including the metal tier of the plan, the name of the issuer of the plan, the type of plan (PPO, HMO, POS, EPO), the monthly gross premiums of the standard plan for various family types, the deductibles and cost sharing arrangements of the standard plan, and the deductibles and cost sharing arrangements of the variants of the plan that feature cost sharing reductions as described in 42 U.S.C. § 18071. The remaining data comes from the United States census.

Methodology

The idea here is to consider each county of the United States as a market for health insurance and to find, for each county, the number of issuers selling plans on the Exchange, a representative measure of the price being charged by each issuer, and, therefore, a representative measure of the price charged within each county. If competition resulted in lower prices, one would expect to see (all other things being equal, which of course they are not) an inverse relationship between the number of issuers and the representative price charged within each county. We can also see, however, whether any such correlation is spurious, arising from factors that correlate with both the number of issuers and the premiums charged, or whether a stronger correlation might appear if other factors were controlled for. Here, the one other factor I took account of was county population density, the idea being that insurers might be less eager to enter counties in which the population density was low and that prices might be higher in such areas due to transportation costs.
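The two-stage aggregation described above (plans to one price per issuer, then issuers to one price per county) can be sketched in a few lines of Python. This is only an illustration, not the code used for the analysis: the tuple layout, the function name `county_prices` and the sample premiums are all assumptions, and the median is simply one reasonable choice of representative measure.

```python
from collections import defaultdict
from statistics import median


def county_prices(plans, issuer_stat=median, county_stat=median):
    """Collapse plan premiums to one price per issuer, then one per county.

    `plans` is an iterable of (county, issuer, premium) tuples. These field
    names are hypothetical, not the actual healthcare.gov column names.
    Returns {county: (number_of_issuers, representative_price)}.
    """
    # Stage 1: gather every premium offered by each issuer in each county.
    by_issuer = defaultdict(list)
    for county, issuer, premium in plans:
        by_issuer[(county, issuer)].append(premium)

    # Stage 2: reduce each issuer to a single price, grouped by county.
    by_county = defaultdict(list)
    for (county, _issuer), premiums in by_issuer.items():
        by_county[county].append(issuer_stat(premiums))

    # Stage 3: reduce the issuer prices to one county price.
    return {c: (len(p), county_stat(p)) for c, p in by_county.items()}


# Made-up premiums, for illustration only.
plans = [
    ("County A", "Issuer 1", 500.0),
    ("County A", "Issuer 1", 540.0),
    ("County A", "Issuer 2", 480.0),
    ("County B", "Issuer 3", 610.0),
]
result = county_prices(plans)
```

Swapping in other functions for `issuer_stat` and `county_stat` (for example `min`, or a percentile function) changes the representative measure without touching the rest of the pipeline.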

Visualizing the Results

The “Distribution Chart” below shows a typical result from this data exploration. Here is the distribution of representative monthly premiums charged to a couple in which both members are 40 years old for a Silver PPO plan. The plot is broken down by the number of issuers within the county. If the insurer sells more than one Silver PPO plan within a county (which sometimes occurs), I take the median price for that insurer. And to determine the county price, I take the median price across all of the issuers.

The Distribution Chart works by using a dot to represent each gross monthly premium broken down by number of issuers. It applies different background colors that depend on the number of issuers within the county and shades each part of the background according to the density of premiums at that price level. Darker shades represent higher density.

We can run the same analysis for different purchasers, different metal levels, different types of plans and using different measures to move from issuer prices within a county to a single representative issuer price and to move from representative issuer prices to a representative county price. Here, for example, is the Distribution Chart for gold PPO plans purchased by couples age 40 with two children in which I use the minimum price offered by the issuer within each county and then use the 25th percentile price of those minimum prices to come up with a representative county price.

Distribution Chart for Gold PPOs (Couple + 2 children, Age 40), minimum by issuer, 25th percentile to derive county price

We can also aggregate matters. Here is the Distribution Chart for all Bronze plans of all types (HMO, PPO, POS, EPO) in which I take the median of multiple plans issued by a single issuer and then take the median value of all issuers to derive a county price. I do this for a single adult, age 30.

Here’s an analysis examining all types of Bronze plans but using a variant of the visualization. The individual dots are suppressed and we now have little histograms for situations in which there is 1 issuer through 8 issuers.

Histogram density visualization of all bronze plans

Eyeball Analysis

When I eyeball this data and the many more permutations that I have produced, I at least do not see any dramatic and widespread relationship between the number of issuers within a county and the representative gross premium being charged. For some combinations of parameters, one occasionally sees higher prices when there is only one issuer in the county, but generally the picture, at least to the naked eye, is quite blurry. The one thing I can say with some certainty is that the family type of the purchaser (individual, couple, family with children) does not appear to affect matters. Premiums appear quite uniformly scaled across these groups.

What I do consistently see is, as noted here and here, that there are many counties in which there is only one issuer of a particular level and type of plan. For Silver PPO plans, for example, in which one wants a medium level of cost sharing but wants at least some freedom in selecting a provider, 20% of the 2,512 counties have no issuers with such a plan while another 36.6% have only one such issuer. Only 13% of the counties have three or more issuers of these plans. The pie chart below shows the distribution of issuers.

Or, suppose one simply wants a bronze plan of any sort. What we see is that 16.2% of the counties apparently have no such plan, 27.9% have only one issuer and 31% have 2. Thus, taking these figures together, only about one quarter of the counties (the remaining 24.9%) have 3 or more choices for a simple bronze plan. The pie chart below shows the result.
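Shares like these come from a simple tally of issuers per county. A minimal Python sketch follows, assuming a hypothetical mapping from county to issuer count in which counties with no issuer are listed explicitly with a count of 0 (otherwise they would silently drop out of the denominator). The function name and the sample counts are made up for illustration.

```python
from collections import Counter


def issuer_count_shares(issuers_per_county, cap=3):
    """Share of counties with 0, 1, 2, ... issuers; counts >= cap are pooled."""
    buckets = Counter(min(n, cap) for n in issuers_per_county.values())
    total = len(issuers_per_county)
    return {k: buckets.get(k, 0) / total for k in range(cap + 1)}


# Made-up issuer counts for eight hypothetical counties.
counts = {"A": 0, "B": 1, "C": 1, "D": 2, "E": 3, "F": 5, "G": 2, "H": 1}
shares = issuer_count_shares(counts)
```

Here the key `3` stands for "three or more issuers".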

Statistical Analysis

Sometimes the human eye and the human brain, magnificent as those organs are, do not see patterns that in fact emerge when studied through the lens of statistics or machine learning. Modern computers and statistical software make it easy to go beyond eyeballing data. What I have done, therefore, is to merge representative premium data with data on the population density of each county and to see whether any statistically significant relationship emerges between the number of issuers within each county and the county representative price.

I want to start with the simplest model: a linear relationship between the number of issuers and the county representative premium. I will do the analysis at first for my baseline Silver PPO purchased by a couple age 40, where I use the median price of the issuer if it sells more than one Silver PPO within the county and the median price across issuers as the county price. The graphic below shows the results. There is a statistically significant relationship between the number of issuers and the premium. For each additional issuer, the gross premium goes down by about $16. The model overall, however, accounts for only 2.1% of the variation in representative county prices, meaning, roughly speaking, that about 98% of the variation in premiums is correlated with factors other than the number of issuers.
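For readers who want to see what such a fit involves, here is a minimal Python sketch of a one-variable least-squares regression that returns the intercept, the slope (the price change per additional issuer) and R-squared (the share of variation explained). It is not the code behind the results above, the data are invented, and significance tests are left out.

```python
def simple_ols(x, y):
    """Fit y = intercept + slope * x by least squares; also return R-squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared compares the residual variation with the total variation.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return intercept, slope, 1.0 - ss_res / ss_tot


# Made-up county data: number of issuers and representative monthly premium.
issuers = [1, 1, 2, 2, 3, 3, 4, 5]
premiums = [640.0, 580.0, 560.0, 610.0, 590.0, 545.0, 600.0, 555.0]
intercept, slope, r_squared = simple_ols(issuers, premiums)
```

A low R-squared alongside a nonzero slope is exactly the situation described above: the number of issuers matters a little, and most of the variation lies elsewhere.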

Linear regression of county representative price on number of issuers

The problem with leaping from this finding to an attempted vindication of claims about the virtues of the ACA is that the result, even weak as it is, depends a bit on the specification of the model. This gets a little technical, but unless one assumes a priori that there is some good reason to think that the relationship between number of issuers and price is in fact a linear one, restricting the regression to a simple linear model is potentially misleading. Here, for example, I regress the same data on n (the number of issuers), n-squared and the log of n. All of the coefficients in front of the various terms are still significant, but if one looks at the picture one gets a much more complex story. It appears that having one issuer does lead to high prices and that having two issuers may minimize prices. As one increases the number of issuers beyond two, prices go up again until they peak at four issuers. This model explains almost 9% of the variance in pricing, which is considerably better than the simplest linear model but still not very good. Clearly, pricing is determined by much more than the number of issuers within a county.
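The richer specification is still an ordinary least-squares problem; only the columns change. The Python sketch below builds the columns 1, n, n-squared and log n and solves the normal equations directly. It is an illustration under stated assumptions rather than the original analysis: the helper names are invented, the data are made up, and a real analysis would use a statistics package that also reports standard errors.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    k = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, k):
            factor = m[r][col] / m[col][col]
            for c in range(col, k + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * k
    for r in reversed(range(k)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, k))
        x[r] = (m[r][k] - tail) / m[r][r]
    return x


def features(n):
    """Regressors for a county with n issuers: constant, n, n^2, log n."""
    return [1.0, float(n), float(n) ** 2, math.log(n)]


def fit(issuer_counts, prices):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    rows = [features(n) for n in issuer_counts]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * p for r, p in zip(rows, prices)) for i in range(k)]
    return solve(xtx, xty)


def predict(beta, n):
    return sum(b * f for b, f in zip(beta, features(n)))


# Made-up data; at least four distinct issuer counts are needed for a unique fit.
counts = [1, 1, 2, 2, 3, 3, 4, 4, 5, 6]
prices = [640.0, 615.0, 560.0, 575.0, 580.0, 590.0, 605.0, 598.0, 585.0, 570.0]
beta = fit(counts, prices)
curve = [predict(beta, n) for n in range(1, 7)]
```

Plotting `curve` against the number of issuers is what reveals a non-monotonic shape that a straight line cannot show.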

Pricing model based on linear, quadratic and logarithmic term

The observed pattern when this more complex regression model is used appears roughly to persist for all metal types of HMOs and PPOs except platinum PPOs where we see the price increase as the number of issuers within a county increases. The family type of the purchaser appears not to affect the general shape of the relationship. I am never able to explain more than about 12% of the variance in premium pricing when I use just the number of issuers within the county as my single explanatory variable.

I have some sense that the population density of a county might have an effect on pricing. Perhaps lower density counties are more expensive. Or, it could be the case that higher density counties, which may have fancier equipment, are more expensive. The regression below shows a simple linear regression using two variables: number of issuers within the county and population density of the county. As one can see, the results are little changed. Both variables have effects that are statistically significant but small. As one goes from 1 to 2 issuers, the price drops by about $17 per month. As one goes from a county in which the population density is 4.3 (which would put it in the 10th percentile) to a county in which the population density is 491 (which would put it in the 90th percentile), the price goes up by $7 per month. The model still does not explain much (adjusted R-squared <0.03). Here are the results in more detail.
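To put the density effect on the same footing as the issuer effect, one can back out the slope implied by the figures just quoted. The short Python calculation below uses only numbers from the paragraph above; the variable names are mine.

```python
# Figures quoted in the text above.
low_density = 4.3      # 10th percentile county
high_density = 491.0   # 90th percentile county
price_change = 7.0     # dollars per month across that range
issuer_effect = -17.0  # dollars per month going from 1 to 2 issuers

# Implied price change per one-unit increase in population density.
implied_slope = price_change / (high_density - low_density)

# Size of the full 10th-to-90th percentile density swing relative to
# the effect of adding a second issuer.
relative_size = price_change / abs(issuer_effect)
```

The implied slope is roughly 1.4 cents per month per unit of density, and the whole 10th-to-90th percentile swing is worth well under half of the effect of adding a second issuer.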

Linear regression using number of issuers in county and population density

Again, I can use a more complex specification. Below I show the results of using linear, quadratic and logarithmic terms for both number of issuers and population density. What we see is a complex picture in which having just one issuer appears to persist in causing somewhat higher prices and in which population density plays a small role. But we are still able to explain less than 10% of the variation in premiums. Again, whatever is going on in premium pricing is a lot more complex.

Linear, quadratic and logarithmic terms for number of issuers and population density

A Foray into Machine Learning

I also attempted to see whether a computer could find a formula that predicted county representative gross premiums any better than my statistical models when given free rein to do so. To do this, I loaded the data into a program called Eureqa from Nutonian.com, which basically uses “genetic programming” to find models that predict well. The basic idea is to treat mathematical formulae kind of like strands of genetic material and permit mathematical formulae that perform better to evolve via mutation and “sex” to produce what may be yet better formulae. Sometimes it produces amazing results and, well, sometimes it does not. Either way, however, genetic programming and other methods of machine learning are a useful complement to traditional techniques. They help one check whether the apparent incapacity of traditional methods such as regression is an artifact of limited specifications or the result of unavoidable noise in the data.
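To make the idea concrete, here is a toy genetic-programming loop in Python. It is emphatically not Eureqa's algorithm, whose internals are not described here; it is a bare-bones sketch in which formulas are small expression trees, the fittest quarter survives each generation, and new candidates come from mutation and crossover. The operator set, population size, size cap and data are all arbitrary assumptions.

```python
import math
import random

# A formula is "x", a float constant, or a tuple (op, left, right).
OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}
MAX_NODES = 40  # arbitrary cap that keeps formulas from growing without bound


def random_tree(rng, depth):
    if depth == 0 or rng.random() < 0.3:
        return "x" if rng.random() < 0.5 else round(rng.uniform(-5, 5), 2)
    return (rng.choice(list(OPS)), random_tree(rng, depth - 1), random_tree(rng, depth - 1))


def evaluate(tree, x):
    if tree == "x":
        return x
    if isinstance(tree, float):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))


def error(tree, data):
    """Mean squared error; non-finite results are treated as infinitely bad."""
    total = 0.0
    for x, y in data:
        diff = evaluate(tree, x) - y
        total += diff * diff
    return total / len(data) if math.isfinite(total) else math.inf


def paths(tree, path=()):
    """Yield the position of every node in the tree."""
    yield path
    if isinstance(tree, tuple):
        yield from paths(tree[1], path + (1,))
        yield from paths(tree[2], path + (2,))


def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree


def replace(tree, path, new):
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace(tree[path[0]], path[1:], new)
    return tuple(parts)


def mutate(rng, tree):
    """Swap a randomly chosen subtree for a fresh random one."""
    return replace(tree, rng.choice(list(paths(tree))), random_tree(rng, 2))


def crossover(rng, mother, father):
    """Graft a random piece of one parent into a random spot in the other."""
    donor = get(father, rng.choice(list(paths(father))))
    return replace(mother, rng.choice(list(paths(mother))), donor)


def evolve(data, generations=30, size=60, seed=0):
    rng = random.Random(seed)
    population = [random_tree(rng, 3) for _ in range(size)]
    for _ in range(generations):
        population.sort(key=lambda t: error(t, data))
        survivors = population[: size // 4]  # the best formulas always survive
        children = []
        while len(survivors) + len(children) < size:
            if rng.random() < 0.5:
                child = mutate(rng, rng.choice(survivors))
            else:
                child = crossover(rng, rng.choice(survivors), rng.choice(survivors))
            if sum(1 for _ in paths(child)) <= MAX_NODES:
                children.append(child)
        population = survivors + children
    return min(population, key=lambda t: error(t, data))


# Made-up target: premium as a function of the number of issuers.
data = [(float(n), 650.0 - 16.0 * n) for n in range(1, 9)]
best = evolve(data)
```

Because the best formulas are carried over unchanged, the best error can never get worse from one generation to the next; whether it gets much better depends on how much signal the data contain.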

In this case, Eureqa basically found little. It found some functional forms a human might not come up with such as the one below, which appeared to predict decently, but in fact did not do any better than the models I developed by hand. The foray into machine learning suggests, then, that the limited ability of our statistical models to predict well is not the result of a failure to specify the model correctly but rather the result of noise in the data and unobserved variables.

Thoughts

Unfortunately, perhaps, the results shown here are not the sort one writes home about or that get on the front page of either scholarly publications or news reports. They are kind of “meh” results. Maybe market concentration has an effect, but, at least as revealed by the data here, it is small. So, why might this be?

1. Perhaps the number of insurers in the Exchange is not as relevant anymore as might be thought. Given the availability of individual policies off the Exchange in some states, the number of individual policies within the Exchange may not be as important. I don’t have the data on off-Exchange policies and neither, so far as I know, does anyone else.

2. Maybe pricing is determined more by the identity of the insurer than the number of insurers. Suppose, for example (and I do not say this is true), that Blue Cross made different assumptions about adverse selection and moral hazard with the purchasing population than did, say, United Healthcare. Markets that Blue Cross entered aggressively might thus have lower representative county prices than markets in which it did not. Or suppose that Blue Cross was able to use market power and/or superior skill to create narrower networks that nonetheless satisfied regulators. This might account for markets in which Blue Cross was present exhibiting lower prices. Or suppose that Humana was more willing to take a loss the first year in order to supposedly lock in business than was Blue Cross. This too might explain lower pricing. This suggests another experiment in which one looks at pricing as a contest and sees how each of the competitors fared against the others.

3. Maybe consumers are very sophisticated such that “Silver PPO plans” are not comparable. If consumers, for example, value the precise package of benefits and providers offered by, say, Blue Cross in a county as being quite different from the precise package of benefits and providers offered by, say, Humana, then we can’t just count issuers in determining the level of competition in a county.

4. Population density isn’t the right variable to include. Maybe what we need is some measure of medical pricing by counties. Or maybe, as the Wall Street Journal suggested, we need to include some measure of income or income inequality. Sadly, it may be that healthcare costs more in poorer counties, perhaps because the poor have more serious health problems. At the moment I have not included those variables. Future examinations of this area should probably do so insofar as the data permits.

Note

Ordinarily, it would be my practice to make the Mathematica notebooks used to conduct this analysis fully available. I very much believe in transparency. Unfortunately, this analysis was conducted using features in a beta version of Mathematica 10 and I have signed a non-disclosure agreement with respect to that software. While I received consent to show certain results from use of that software, I did not request or receive consent to show code. Moreover, the code would not work on computers that do not have Mathematica 10. I commit to releasing the code as soon as Mathematica 10 is out of beta. I don’t think my NDA stops me from saying, however, that Mathematica 10 looks somewhere between absolutely spectacular and completely mind-blowing.


Does competition in the Exchanges result in lower premiums?