Census Bureau statisticians and outside experts are trying to solve a mystery.

According to agency documents, residents left unanswered a variety of questions about sex, race, Hispanic heritage, family relationships and age, or about how many people lived in their home. Statisticians had to fill in the gaps.

Reflecting an early stage in the number crunching, the documents show that 10% to 20% of questions were not answered in the 2020 census, depending on the question and state. The Census Bureau says that the actual rates were lower in later processing phases.

According to Steven Ruggles, a University of Minnesota demographer, the rates ranged from 1% to 3% over 170 years of U.S. censuses.

The information is vital because it will be used to draw congressional and legislative districts. The data, which the Census Bureau will release Thursday, is also used to distribute $1.5 trillion in federal spending each year.

The documents were made public in response to an open records request from a Republican redistricting advocacy organization. They don’t provide much insight into why certain questions weren’t answered, but theories abound. Some experts believe that the software used for the first census in which most Americans could respond online allowed people to skip certain questions. Others say the pandemic made it more difficult to reach people who weren’t responding.

Confusion over certain questions, including longstanding uncertainty among Hispanics about how to answer the race question, could have played a role, but experts suggest there may be a deeper cause: a chilling effect from the Trump administration’s attempts to end the count early and its failed efforts to add a citizenship question to the forms and to exclude illegal immigrants.

“I believe it’s Trump and the pandemic,” said Andrew Beveridge, a sociologist at Queens College and the City University of New York Graduate Center, adding that the mere threat of a citizenship question on the questionnaire may have discouraged some Latinos from filling it out. “It is a statistic that I think many of us are shocked by. It’s a high number.”

Ruggles thought the cause might be the online software that most Americans, about two-thirds, used to answer the census. But similar software has been used in other countries such as Canada and Australia, where the number of unanswered questions dropped to nearly zero because respondents couldn’t continue if they didn’t answer a question.

“I suppose in the U.S. version they must have just accepted incomplete responses,” Ruggles said. “If the non-response rates were consistently high across all response modes, that is just bizarre.”

Ron Jarmin, the acting director of the Census Bureau, said in a recent blog post that blank answers spanned all types of questions and all modes of response: online, paper, phone and face-to-face.

“These blank responses left holes in the data which we had to fill,” Jarmin said.

Jarmin said last week that the bureau would not release updated rates until later in the month. In a statement to The Associated Press, he said only that it would “report the correct numbers.”

To fill in the gaps, Census Bureau statisticians turned to administrative records such as tax forms and Social Security card applications. They also searched previous censuses for people’s race, gender and Hispanic heritage.

If available records didn’t turn up the information needed, they turned to a statistical technique called imputation, which the Census Bureau has used for 60 years. The technique was challenged in court after past censuses and was upheld.

Statisticians may have looked at information about one family member, such as race, and applied it to another member whose answers were blank. They also assigned a sex to respondents based on their first names. In some cases, when information was missing for an entire household, they used data from similar neighbors.
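The three fallbacks described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the Census Bureau’s actual procedure: the `Person` record, the `impute_household` function and the tiny `NAME_TO_SEX` table are all hypothetical names invented for this example, and the documents do not say how “similar neighbors” are chosen, so the donor household is simply passed in.

```python
from dataclasses import dataclass, replace
from typing import List, Optional

# Toy lookup table, purely illustrative (an assumption for this sketch).
NAME_TO_SEX = {"maria": "F", "james": "M"}


@dataclass(frozen=True)
class Person:
    first_name: str
    race: Optional[str] = None  # None means the answer was left blank
    sex: Optional[str] = None


def first_reported_race(people: List[Person]) -> Optional[str]:
    """Return the first non-blank race in a household, if there is one."""
    return next((p.race for p in people if p.race is not None), None)


def impute_household(
    people: List[Person], donor_household: Optional[List[Person]] = None
) -> List[Person]:
    """Fill blank race and sex fields using the fallbacks in the text."""
    # Fallback 1: borrow race from a household member who reported it.
    race_to_use = first_reported_race(people)
    # Fallback 3: if the whole household is blank, borrow from a similar
    # neighboring household (choosing that neighbor is out of scope here).
    if race_to_use is None and donor_household:
        race_to_use = first_reported_race(donor_household)

    filled = []
    for p in people:
        race = p.race if p.race is not None else race_to_use
        # Fallback 2: assign sex from the first name when it was left blank.
        sex = p.sex if p.sex is not None else NAME_TO_SEX.get(p.first_name.lower())
        filled.append(replace(p, race=race, sex=sex))
    return filled


# Usage: one member answered, the other left everything blank.
household = [Person("Maria", race="Asian", sex="F"), Person("James")]
print(impute_household(household))
```

Reported answers are never overwritten in this sketch; only blank fields are filled, and a field stays blank when no fallback applies.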

In a recent blog post, Roberto Ramirez and Christine Borman of the Census Bureau said that imputation can improve data quality and accuracy compared with leaving the fields empty, without any information from respondents.

The Census Bureau in April released state population totals from the 2020 census. These are used to divide up congressional seats among the states through a once-a-decade process called apportionment.

The agency released a slide presentation with the first details about the high rate of unanswered questions, along with group housing records, in response to an open records request from Fair Lines America Foundation. The Republican advocacy group sued the Census Bureau to get information on how the census was conducted in prisons, dorms, nursing homes and other places where people live in groups. Fair Lines says it is concerned about the accuracy and reliability of the group housing count. It also wants to ensure that anomalies don’t affect the state population figures.

With the information showing high rates of imputation, some Republican-controlled states may try to leave college students out of redistricting data, claiming they were also counted at their parents’ homes, to get a partisan edge, said Jeffrey Wice, a Democratic redistricting expert.

“That will prove difficult but would inject more uncertainty into redistricting,” Wice said.