One of the ideas behind employee group policies and one of the purported virtues of employer sponsored health insurance is that the marketing and other overhead costs are lower. Instead of selling 200 policies to 200 persons, the insurer sells one policy to an employer. And one reason employer sponsored health insurance has often been supposed to be lower priced than comparable individual policies is that individuals who are gainfully employed tend not to have at least some of the expensive chronic or acute conditions such as certain forms of cancer or serious heart disease.
These assumptions predict that if one looks at policies sold in the individual exchange and the SHOP exchange under the Affordable Care Act that are sold in the same county, from the same issuer, have the same marketing name, have the same metal level, the same plan type (PPO, HMO, etc.), the same deductible and the same out-of-pocket limit, the SHOP premiums should on balance be lower than the individual premiums. Apples to apples, they should at least be no higher. Indeed, lower potential premiums are one of the key reasons for the existence of SHOP exchanges.
When I examine the data available from healthcare.gov, however, I find that the opposite is true.
SHOP policies are on average 8% more expensive than apparently identical individual policies.
That’s the key finding. The rest of this post tries to figure out why this might be the case. I’ll tell you the statistical ground I traversed in this effort. But I will confess that at the end of the day I am not quite sure why it is that SHOP premiums sure appear to be higher.
It’s not a matter of outliers skewing the mean. The median premium ratio between comparable SHOP and individual policies (the “SHOP/individual ratio”) is 1.08. 10% of the comparable policies are more than 21% more expensive on the SHOP exchange. And even the lowest 10% are just 2% cheaper on the SHOP exchange than on the individual exchange.
The figure below shows the distribution of SHOP/individual ratios among the 12,689 comparable policies in the dataset. As can readily be seen, most policies are more expensive on the SHOP exchange. (A SHOP/individual ratio greater than 1 means that the median SHOP policy was more expensive than the median comparable individual policy).
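For readers who want to try a comparison like this themselves, here is a minimal Python sketch of the matching and ratio computation described above. It is not the code behind this post. The field names and the sample premiums are hypothetical placeholders, and the real healthcare.gov export uses its own column labels.

```python
from statistics import median, quantiles

# Hypothetical field names for the matching criteria described in the post.
MATCH_KEYS = ("county", "issuer", "marketing_name", "metal",
              "plan_type", "deductible", "oop_limit")

def _group_premiums(plans):
    """Group premiums by the full matching key."""
    grouped = {}
    for plan in plans:
        key = tuple(plan[k] for k in MATCH_KEYS)
        grouped.setdefault(key, []).append(plan["premium"])
    return grouped

def shop_individual_ratios(individual_plans, shop_plans):
    """Median SHOP premium over median individual premium, per matched key."""
    individual = _group_premiums(individual_plans)
    shop = _group_premiums(shop_plans)
    return [median(shop[key]) / median(individual[key])
            for key in shop if key in individual]

def summarize(ratios):
    """Median plus 10th and 90th percentiles (needs at least two ratios)."""
    deciles = quantiles(ratios, n=10, method="inclusive")
    return {"median": median(ratios), "p10": deciles[0], "p90": deciles[-1]}

# Made-up sample data, for illustration only.
base = {"county": "A", "issuer": "X", "marketing_name": "Plan 1",
        "metal": "Silver", "plan_type": "PPO",
        "deductible": 2000, "oop_limit": 6000}
individual = [dict(base, premium=300.0), dict(base, county="B", premium=400.0)]
shop = [dict(base, premium=324.0), dict(base, county="B", premium=420.0),
        dict(base, county="C", premium=500.0)]  # county C has no individual match

ratios = shop_individual_ratios(individual, shop)
summary = summarize(ratios)
print(summary)
```

Plans that lack a counterpart on the other exchange simply drop out, which is why the number of comparable policies is smaller than the number of plans in the raw data.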
Breaking down the data to gain insight
The state in which the policy was sold and the issuer might affect the SHOP/individual ratio. Unfortunately, the effects of these potentially separate variables are extremely difficult to tease apart because no issuer sold in more than one state. The table to the left shows the results. It shows significant variation in median SHOP/individual ratios across the combination of issuer and state. On the top end, Minuteman Health, which sells in New Hampshire, has a ratio of 1.31; Health Alliance Medical Plans, which sells in Illinois, has a ratio of 1.25; and Montana Health CO-OP has a ratio of 1.2. On the bottom end, CommunityCare, which sells in Oklahoma, has a ratio of 0.76 and CoOpportunity Care, which sells in Iowa, has a ratio of 0.86.
The best I can do to tease out whether it is the state in which the policy is being sold or the issuer in the state that is responsible for the variation is to look at those states in which all the issuers have SHOP/individual ratios greater than the median of 1.08 and none have a lower one. Two states show up: Montana and Texas. This suggests some regulatory issue or special market condition in those two states that is leading SHOP premiums to be unusually high or individual premiums to be unusually low. Unfortunately, the finding is not particularly robust and I will confess to having little idea what this special factor might be.
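The state-level check just described can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the state and issuer labels and the ratios below are invented placeholders, not values from the dataset.

```python
def states_uniformly_above(issuer_ratios, threshold):
    """Return states in which every issuer's median ratio exceeds the threshold.

    issuer_ratios: iterable of (state, issuer, median_ratio) tuples
    (a hypothetical layout chosen for this sketch).
    """
    by_state = {}
    for state, _issuer, ratio in issuer_ratios:
        by_state.setdefault(state, []).append(ratio)
    return sorted(state for state, ratios in by_state.items()
                  if all(r > threshold for r in ratios))

# Invented placeholder data, for illustration only.
sample = [
    ("State A", "Issuer 1", 1.20),
    ("State A", "Issuer 2", 1.12),
    ("State B", "Issuer 3", 1.25),
    ("State B", "Issuer 4", 0.95),
    ("State C", "Issuer 5", 0.86),
]
flagged = states_uniformly_above(sample, threshold=1.08)
print(flagged)
```

A state with only one issuer passes this test trivially whenever that issuer is above the threshold, which is one reason to treat the result with caution.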
We can see if the metal level and the plan type end up affecting the SHOP/individual ratio once the issuer and state are controlled for. Multivariate linear regression shows the impact of Metal Level and Plan Type to be quite small. The greatest effect was shown by POS plans, which reduced the SHOP/individual ratio by 0.03. Everything else had a smaller effect. Basically, SHOP plans tend to be more expensive than comparable individual market plans without regard to metal level or plan type.
An adverse selection theory?
Let me at least explore one other reason SHOP premiums might generally be higher. There are several buffers against adverse selection (the proclivity of persons with accurate knowledge of higher risk to select greater amounts of insurance coverage) in the individual market. A key one of these is the system of premium subsidies, which means that many individuals, even those of relatively low risk, pay less for insurance than their expected cost of health care. There’s the individual mandate, which likewise coaxes individuals, even those of low relative risk, to purchase insurance. And there’s an open enrollment period: an individual can’t easily wait until they get sick and then buy insurance in the middle of the year. In theory, they can purchase insurance outside of the open enrollment window only if they qualify for “special enrollment” by virtue of a small set of non-medical changes in their life circumstances.
The SHOP market may be more vulnerable to adverse selection problems. There’s no employer mandate that applies to the small employers that tend to be eligible for SHOP policies. Although there are some tax credit subsidies, they tend to be less lavish than those bestowed on individuals and they apply only to some small employers. For other small employers there really aren’t any of the sort of “special deals” that might make it a good deal even for those whose employees and dependents are low risk. Finally, there is no limited open enrollment period. On the contrary, under 45 C.F.R. § 155.725(b), “[t]he SHOP must permit a qualified employer to purchase coverage for its small group at any point during the year.” Thus, an employer can wait until someone they care about, say the CEO’s daughter, gets really sick and then decide it would be prudent to have an employee group health plan.
What we may be seeing in the SHOP exchanges is an insurance world in which even the mild protections against adverse selection contained on the individual exchanges do not exist. And the result is predictable: higher prices and very few buyers.
So, what do we do with this finding? What do we do about the fact that SHOP policies tend to be priced higher than comparable individual ones? Standing by itself, the result does not, I think, support much of an affirmative policy implication, particularly where we don’t yet have a good handle on causation. I do think it argues for thinking about imposing some adverse selection controls, such as limited open enrollment periods, if we are going to keep SHOP alive.
I also think, however, that the findings disclosed here have kind of a rebuttal value. They mean that proponents of SHOP exchanges should have a difficult time grounding an argument to preserve that additional complexity of Obamacare on grounds that it saves money. Although there are, to be sure, exceptions, and although there may be other reasons to induce small employers to purchase health insurance for their employees, right now it does not look as if the price is right.
One of the touted benefits of the Affordable Care Act was that, by fostering transparency, there would be greater competition in the health insurance market and that premiums would go down as a result. We now have data to help see whether competition within the various Exchanges has succeeded in reducing prices. This post, based on a scholarly talk I recently gave at the University of San Diego’s Workshop on Computation, Mathematics and Law, will suggest that the effect, if there is one, is small and subtle. It looks as if having just one seller of a product within a county may lead to somewhat higher prices, but the effect may not be robust. The methodology used here is a first cut. Whether other methodologies might tease out a larger relationship remains to be seen.
Note to ACA Death Spiral Fans: The USD conference mentioned above is one reason for the infrequent posts as of late. It’s been a busy period. Sorry. There’s A LOT to write about. Keeping track of Obamacare is at least a full time job.
The data for this project comes mostly from good old healthcare.gov, which, if one forages around a bit, actually contains a user-friendly database exportable in various standard formats such as CSV and JSON describing all 78,392 plans currently being sold in 2,512 counties via the federal Exchange. Each plan is described by 128 fields, including the metal tier of the plan, the name of the issuer of the plan, the type of plan (PPO, HMO, POS, EPO), the monthly gross premiums of the standard plan for various family types, the deductibles and cost sharing arrangements of the standard plan, and the deductibles and cost sharing arrangements of the variants of the plan that feature cost sharing reductions as described in 42 U.S.C. § 18071. The remaining data comes from the United States census.
The idea here is to consider each county of the United States as a market for health insurance and to find, for each county, the number of issuers selling plans on the Exchange, a representative measure of the price being charged by each issuer, and, therefore, a representative measure of the price charged within each county. If competition resulted in lower prices, one would expect to see — all other things being equal, which of course they are not — an inverse relationship between the number of issuers and the representative price charged within each county. We can also see, however, whether any such correlation is either spurious as a result of factors that correlate with both the number of issuers and the premiums charged or whether a stronger correlation might appear if other factors were controlled for. Here, the one other factor I took account of was county population density, the idea being that insurers might be less eager to enter counties in which the population density was low and that prices might be higher in such areas due to transportation costs.
Visualizing the Results
The “Distribution Chart” below shows a typical result from this data exploration. Here is the distribution of representative monthly premiums charged a couple in which the members are both 40 years old for a Silver PPO plan. The plot is broken down by the number of issuers within the county. If the insurer sells more than one Silver PPO plan within a county, which sometimes occurs, I take the median price for that insurer. And to determine the county price, I take the median price for all of the issuers.
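The two-step aggregation can be sketched in Python as follows. This is an illustrative reconstruction, not the notebook code used for the post. The tuple layout and the sample premiums are assumptions, and the two aggregation functions are parameters so that other choices can be swapped in.

```python
from statistics import median

def county_prices(plans, within_issuer=median, across_issuers=median):
    """Collapse plan premiums to (issuer count, representative price) per county.

    plans: iterable of (county, issuer, premium) tuples, already filtered to
    one plan category and purchaser type (a hypothetical layout).
    """
    by_county = {}
    for county, issuer, premium in plans:
        by_county.setdefault(county, {}).setdefault(issuer, []).append(premium)

    result = {}
    for county, issuers in by_county.items():
        # Step 1: one representative price per issuer within the county.
        issuer_prices = [within_issuer(premiums) for premiums in issuers.values()]
        # Step 2: one representative price for the county across its issuers.
        result[county] = (len(issuer_prices), across_issuers(issuer_prices))
    return result

# Made-up premiums, for illustration only.
plans = [
    ("County A", "Issuer X", 500.0),
    ("County A", "Issuer X", 540.0),
    ("County A", "Issuer Y", 480.0),
    ("County B", "Issuer X", 610.0),
]
prices = county_prices(plans)
print(prices)
```

Passing `within_issuer=min`, for example, uses each issuer's cheapest plan instead of its median plan.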
The Distribution Chart works by using a dot to represent each gross monthly premium broken down by number of issuers. It applies different background colors that depend on the number of issuers within the county and shades each part of the background according to the density of premiums at that price level. Darker shades represent higher density.
We can run the same analysis for different purchasers, different metal levels, different types of plans and using different measures to move from issuer prices within a county to a single representative issuer price and to move from representative issuer prices to a representative county price. Here, for example, is the Distribution Chart for gold PPO plans purchased by couples age 40 with two children in which I use the minimum price offered by the issuer within each county and then use the 25th percentile price of those minimum prices to come up with a representative county price.
We can also aggregate matters. Here is the Distribution Chart for all Bronze plans of all types (HMO, PPO, POS, EPO) in which I take the median of multiple plans issued by a single issuer and then take the median value of all issuers to derive a county price. I do this for a single adult, age 30.
Here’s an analysis examining all types of Bronze plans but using a variant of the visualization. The individual dots are suppressed and we now have little histograms for situations in which there is 1 issuer through 8 issuers.
When I eyeball this data and many more permutations that I have produced, I at least do not see any dramatic and widespread relationship between the number of issuers within a county and the representative gross premium being charged. For some combinations of parameters, one occasionally sees higher prices when there is only one issuer in the county, but generally the picture, at least to the naked eye, is quite blurry. The one thing I can say with some certainty is that the family type of the purchaser (individual, couple, family with children) does not appear to affect matters. Premiums appear quite uniformly scaled across these groups.
What I do consistently see is, as noted here and here, that there are many counties in which there is only one issuer of a particular level and type of plan. For Silver PPO plans, for example, in which one wants a medium level of cost sharing but wants at least some freedom in selecting a provider, 20% of the 2,512 counties have no issuers with such a plan while another 36.6% have only one such issuer. Only 13% of the counties have three or more issuers of these plans. The pie chart below shows the distribution of issuers.
Or, suppose one simply wants a bronze plan of any sort. What we see is that 16.2% of the counties apparently have no such plan, 27.9% have only one issuer and 31% have 2. Thus, only about one third of the counties have 3 or more choices for a simple bronze plan. The pie chart below shows the result.
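Tabulating shares like these is straightforward. Here is a minimal Python sketch, assuming a hypothetical mapping from county to the set of issuers offering the plan category of interest. Counties absent from the mapping count as having zero issuers. The sample data is invented.

```python
from collections import Counter

def issuer_count_shares(issuers_by_county, all_counties):
    """Percentage of counties with 0, 1, 2, ... issuers of a plan category."""
    counts = Counter(len(issuers_by_county.get(county, ()))
                     for county in all_counties)
    total = len(all_counties)
    return {n: 100.0 * k / total for n, k in sorted(counts.items())}

# Invented data: four counties, one of which has no issuer at all.
all_counties = ["A", "B", "C", "D"]
issuers_by_county = {"A": {"X"}, "B": {"X", "Y"}, "C": {"X"}}

shares = issuer_count_shares(issuers_by_county, all_counties)
print(shares)
```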
Sometimes the human eye and the human brain, magnificent as those organs are, do not see patterns that in fact emerge when studied through the lens of statistics or machine learning. Modern computers and statistical tools make it easy to go beyond eyeballing data. What I have done, therefore, is to merge representative premium data with data on the population density of each county and see if any statistically significant relationship emerges between the number of issuers within each county and the county representative price.
I want to start with the simplest model: a linear relationship between the number of issuers and the county representative premium. I will do the analysis at first for my baseline Silver PPO purchased by a couple age 40, where I use the median price of the issuer if they sell more than one Silver PPO within the county and the median of the issuers’ prices as the county price. The graphic below shows the results. There is a statistically significant relationship between the number of issuers and the premium. For each additional issuer, the gross premium goes down by about $16. The model overall, however, accounts for only 2.1% of the variation in representative county prices, meaning, roughly speaking, that 98% of the variation in premiums is correlated with factors other than the number of issuers.
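For concreteness, the simple linear model amounts to an ordinary least squares fit of county price on issuer count. The sketch below is plain Python written for illustration, not the software used for this analysis. The premiums in it are invented, and it reports only the slope, intercept and R-squared, without significance tests.

```python
def simple_ols(xs, ys):
    """Fit y = intercept + slope * x by least squares; also return R-squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Invented county data: issuer counts and representative monthly premiums.
issuer_counts = [1, 1, 2, 2, 3, 3]
county_premiums = [640.0, 560.0, 620.0, 550.0, 600.0, 540.0]

slope, intercept, r_squared = simple_ols(issuer_counts, county_premiums)
print(f"slope={slope:.2f} intercept={intercept:.2f} R^2={r_squared:.3f}")
```

A low R-squared alongside a nonzero slope is exactly the situation described above: the direction of the relationship is detectable, but issuer count explains little of the spread in prices.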
The problem with leaping from this finding to an attempted vindication of claims about the virtues of the ACA is that the result, even weak as it is, depends a bit on specification of the model. This gets a little technical, but unless one assumes a priori that there is some good reason to think that the relationship between number of issuers and price is in fact a linear one, restricting the regression to a simple linear model is potentially misleading. Here, for example, I regress the same data on n (the number of issuers), n-squared and the log of n. All of the coefficients in front of the various terms are still significant, but if one looks at the picture one gets a much more complex story. It appears that having one issuer does lead to high prices and that having two issuers may minimize prices. As one increases the number of issuers beyond two, prices go up again until they peak at four issuers. This model explains almost 9% of the variance in pricing, which is considerably better than the simplest linear model but still not very good. Clearly, pricing is determined by much more than the number of issuers within a county.
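The richer specification regresses price on a constant, n, n-squared and log n. Here is a self-contained Python sketch of such a fit. It solves the normal equations directly, which is a choice made for brevity here rather than the method used for the post. The data is synthetic and all function names are illustrative.

```python
import math

def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    size = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * size
    for r in range(size - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, size))
        x[r] = (m[r][size] - tail) / m[r][r]
    return x

def features(n):
    """Regressors for a county with n >= 1 issuers: 1, n, n^2, log(n)."""
    return [1.0, float(n), float(n) ** 2, math.log(n)]

def fit_curve(issuer_counts, premiums):
    """Least squares coefficients from the normal equations (X'X) b = X'y."""
    rows = [features(n) for n in issuer_counts]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, premiums)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def predict(coeffs, n):
    return sum(c * f for c, f in zip(coeffs, features(n)))

# Synthetic premiums generated from made-up coefficients, for illustration.
true_coeffs = [600.0, -40.0, 5.0, 30.0]
counts = list(range(1, 9))
premiums = [predict(true_coeffs, n) for n in counts]

fitted = fit_curve(counts, premiums)
for n in counts:
    print(n, round(predict(fitted, n), 2))
```

Because the three regressors are strongly correlated with one another, individual coefficients from a fit like this are hard to interpret. The fitted curve as a whole is the more meaningful object.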
The observed pattern when this more complex regression model is used appears roughly to persist for all metal types of HMOs and PPOs except platinum PPOs where we see the price increase as the number of issuers within a county increases. The family type of the purchaser appears not to affect the general shape of the relationship. I am never able to explain more than about 12% of the variance in premium pricing when I use just the number of issuers within the county as my single explanatory variable.
I have some sense that the population density of a county might have an effect on pricing. Perhaps lower density counties are more expensive. Or, it could be the case that higher density counties, which may have fancier equipment, are more expensive. The regression below shows a simple linear regression using two variables: number of issuers within the county and population density of the county. As one can see, the results are little changed. Both variables have effects that are statistically significant but small. As one goes from 1 to 2 issuers, the price drops by about $17 per month. As one goes from a county in which the population density is 4.3 (which would put it in the 10th percentile) to a county in which the population density is 491 (which would put it in the 90th percentile), the price goes up by $7 per month. The model still does not explain much (adjusted R-squared <0.03). Here are the results in more detail.
Again, I can use a more complex specification. Below I show the results of using linear, quadratic and logarithmic terms for both number of issuers and population density. What we see is a complex picture in which having just one issuer appears to persist in causing somewhat higher prices and in which population density plays a small role. But we are still able to explain less than 10% of the variation in premiums. Again, whatever is going on in premium pricing is a lot more complex than these models capture.
A Foray into Machine Learning
I also attempted to see whether a computer could find a formula that predicted county representative gross premiums any better than my statistical models when given free rein to do so. To do this, I loaded the data into a program called Eureqa from Nutonian.com, which basically uses “genetic programming” to find models that predict well. The basic idea is to treat mathematical formulae kind of like strands of genetic material and permit mathematical formulae that perform better to evolve via mutation and “sex” to produce what may be yet better formulae. Sometimes it produces amazing results and, well, sometimes it does not. Either way, however, genetic programming and other methods of machine learning are a useful complement to traditional techniques. They help one check whether the apparent incapacity of traditional methods such as regression is an artifact of limited specifications or the result of unavoidable noise in the data.
In this case, Eureqa basically found little. It found some functional forms a human might not come up with, such as the one below, which appeared to predict decently but in fact did not do any better than the models I developed by hand. The foray into machine learning suggests, then, that the limited ability of our statistical models to predict well is not the result of a failure to specify the model correctly but rather the result of noise in the data and unobserved variables.
Unfortunately, perhaps, the results shown here are not the sort one writes home about or that get on the front page of either scholarly publications or news reports. They are kind of “meh” results. Maybe market concentration has an effect, but, at least as revealed by the data here it is small. So, why might this be?
1. Perhaps the number of insurers in the Exchange is not as relevant anymore as might be thought. Given the availability of individual policies off the Exchange in some states, the number of individual policies within the Exchange may not be as important. I don’t have the data on off-Exchange policies and neither, so far as I know, does anyone else.
2. Maybe pricing is determined more by the identity of the insurer than the number of insurers. Suppose, for example (and I do not say this is true), that Blue Cross made different assumptions about adverse selection and moral hazard with the purchasing population than did, say, United Healthcare. Markets that Blue Cross entered aggressively might thus have lower representative county prices than markets in which they did not. Or suppose that Blue Cross was able to use market power and/or superior skill to create narrower networks that nonetheless satisfied regulators. This might account for markets in which Blue Cross was present exhibiting lower prices. Or suppose that Humana was more willing to take a loss the first year in order to supposedly lock in business than was Blue Cross. This too might explain lower pricing. This suggests another experiment in which one looks at pricing as a contest and sees how each of the competitors fared against the others.
3. Maybe consumers are very sophisticated such that “Silver PPO plans” are not comparable. If consumers, for example, value the precise package of benefits and providers offered by, say, Blue Cross in a county as being quite different from the precise package of benefits and providers offered by, say, Humana, then we can’t just count issuers in determining the level of competition in a county.
4. Population density isn’t the right variable to include. Maybe what we need is some measure of medical pricing by counties. Or maybe, as the Wall Street Journal suggested, we need to include some measure of income or income inequality. Sadly, it may be that healthcare costs more in poorer counties, perhaps because the poor have more serious health problems. At the moment I have not included those variables. Future examinations of this area should probably do so insofar as the data permits.
Ordinarily, it would be my practice to make the Mathematica notebooks used to conduct this analysis fully available. I very much believe in transparency. Unfortunately, this analysis was conducted using features in a beta version of Mathematica 10 and I have signed a non-disclosure agreement with respect to that software. While I received consent to show certain results from use of that software, I did not request or receive consent to show code. Moreover, the code would not work on computers that do not have Mathematica 10. I commit to releasing the code as soon as Mathematica 10 is out of beta. I don’t think my NDA stops me from saying, however, that Mathematica 10 looks somewhere between absolutely spectacular and completely mind-blowing.