Note that density is expressed here in thousands of inhabitants per km².

Both population and population density are significant factors in explaining the COVID-19 outbreak in France. The regression gains in significance (see the F and t statistics).

Notice that, as anticipated, the slopes of the fitted least-squares lines are rising. A decline would signal that the peak of the pandemic has passed.

This analysis is based on public data and is subject to revisions and to errors, including errors in processing.

**Data sources:** Géodes – données en Santé Publique, INSEE.

## Analysis

Multiple Regression – COVID19 in hospitals 200327

Dependent variable: COVID19 in hospitals 200327

Independent variables: Population, Th. Inhabitants per km²

| Parameter | Estimate | Standard Error | T Statistic | P-Value |
|---|---|---|---|---|
| CONSTANT | -19.387 | 24.1899 | -0.801452 | 0.4248 |
| Population | 0.000213978 | 0.0000317997 | 6.72892 | 0.0000 |
| Th. Inhabitants per km² | 61.2939 | 6.72891 | 9.10904 | 0.0000 |
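As a quick consistency check on the coefficient table, each T statistic should equal the estimate divided by its standard error. The sketch below only re-does that arithmetic with the values printed above; the names `coefficients` and `t_statistic` are illustrative and not part of the original output.

```python
# Values copied from the coefficient table: name -> (estimate, standard error).
coefficients = {
    "CONSTANT": (-19.387, 24.1899),
    "Population": (0.000213978, 0.0000317997),
    "Th. Inhabitants per km²": (61.2939, 6.72891),
}


def t_statistic(estimate, standard_error):
    """T statistic for the null hypothesis that the coefficient is zero."""
    return estimate / standard_error


t_stats = {name: t_statistic(est, se) for name, (est, se) in coefficients.items()}

for name, t in t_stats.items():
    print(f"{name}: t = {t:.4f}")
```

The recomputed values agree with the reported T statistics up to rounding of the printed inputs.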

Analysis of Variance

| Source | Sum of Squares | Df | Mean Square | F-Ratio | P-Value |
|---|---|---|---|---|---|
| Model | 4.75232E6 | 2 | 2.37616E6 | 116.28 | 0.0000 |
| Residual | 2.00262E6 | 98 | 20434.9 | | |
| Total (Corr.) | 6.75494E6 | 100 | | | |
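The derived columns of the ANOVA table follow from the sums of squares and degrees of freedom alone. The following sketch recomputes them from the printed values. It is a check of the arithmetic, not a re-run of the regression, and the variable names are chosen here for illustration.

```python
import math

# Sums of squares and degrees of freedom copied from the ANOVA table.
ss_model = 4.75232e6
ss_residual = 2.00262e6
df_model = 2
df_residual = 98

# Mean square = sum of squares / degrees of freedom.
ms_model = ss_model / df_model
ms_residual = ss_residual / df_residual

# F-ratio compares explained variance to residual variance.
f_ratio = ms_model / ms_residual

# R-squared is the share of the total (corrected) sum of squares
# that the model explains.
r_squared = ss_model / (ss_model + ss_residual)

# The standard error of the estimate is the square root of the
# residual mean square.
standard_error_est = math.sqrt(ms_residual)

print(f"Mean square (model):    {ms_model:.1f}")
print(f"Mean square (residual): {ms_residual:.1f}")
print(f"F-ratio:                {f_ratio:.2f}")
print(f"R-squared:              {100 * r_squared:.4f} percent")
print(f"Standard error of est.: {standard_error_est:.3f}")
```

These reproduce, up to rounding, the mean squares and F-ratio in the table as well as the R-squared and standard error of the estimate reported with the regression output.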

R-squared = 70.3532 percent

R-squared (adjusted for d.f.) = 69.7482 percent

Standard Error of Est. = 142.951

Mean absolute error = 84.8147

Durbin-Watson statistic = 1.18944 (P=0.0000)

Lag 1 residual autocorrelation = 0.402385

The StatAdvisor

The output shows the results of fitting a multiple linear regression model to describe the relationship between COVID19 in hospitals 200327 and 2 independent variables. The equation of the fitted model is

COVID19 in hospitals 200327 = -19.387 + 0.000213978*Population + 61.2939*Th. Inhabitants per km²
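To make the fitted equation concrete, here is a minimal sketch that evaluates it for one input. The function name `predict_hospitalized` and the example department (1,000,000 inhabitants at 100 inhabitants per km², that is 0.1 thousand per km²) are hypothetical and chosen only for illustration. They do not come from the data set.

```python
def predict_hospitalized(population, density_thousands_per_km2):
    """Evaluate the fitted model using the coefficients reported above.

    population: number of inhabitants
    density_thousands_per_km2: density in thousands of inhabitants per km²
    """
    return (
        -19.387
        + 0.000213978 * population
        + 61.2939 * density_thousands_per_km2
    )


# Hypothetical department, not taken from the source data.
example = predict_hospitalized(population=1_000_000, density_thousands_per_km2=0.1)
print(f"Predicted COVID19 in hospitals: {example:.1f}")
```

Note the units: a density of 100 inhabitants per km² must be passed as 0.1, since the coefficient 61.2939 applies to thousands of inhabitants per km². With a standard error of the estimate near 143, any single prediction of this kind carries a wide margin.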

Since the P-value in the ANOVA table is less than 0.05, there is a statistically significant relationship between the variables at the 95.0% confidence level.

The R-Squared statistic indicates that the model as fitted explains 70.3532% of the variability in COVID19 in hospitals 200327. The adjusted R-squared statistic, which is more suitable for comparing models with different numbers of independent variables, is 69.7482%. The standard error of the estimate shows the standard deviation of the residuals to be 142.951. This value can be used to construct prediction limits for new observations by selecting the Reports option from the text menu. The mean absolute error (MAE) of 84.8147 is the average absolute value of the residuals.

The Durbin-Watson (DW) statistic tests the residuals to determine if there is any significant correlation based on the order in which they occur in your data file. Since the P-value is less than 0.05, there is an indication of possible serial correlation at the 95.0% confidence level. Plot the residuals versus row order to see if any pattern is visible.
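The summary statistics discussed here have simple definitions, sketched below in plain Python. The adjusted R-squared is recomputed from the reported R-squared, assuming 101 observations (total degrees of freedom 100, plus one) and 2 predictors. The source does not publish the residuals, so the MAE and Durbin-Watson functions are demonstrated on a small made-up list. The function names are illustrative.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Penalize R-squared for the number of predictors in the model."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


def mean_absolute_error(residuals):
    """Average of the absolute values of the residuals."""
    return sum(abs(e) for e in residuals) / len(residuals)


def durbin_watson(residuals):
    """Sum of squared successive differences over sum of squared residuals.

    Values near 2 suggest no lag-1 autocorrelation; values well below 2
    suggest positive autocorrelation in row order.
    """
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


# Recompute adjusted R-squared from the reported R-squared.
# n_obs = 101 is inferred from the total degrees of freedom (100).
adj = adjusted_r_squared(0.703532, n_obs=101, n_predictors=2)
print(f"Adjusted R-squared: {100 * adj:.4f} percent")

# Common rule of thumb: DW is roughly 2 * (1 - lag-1 autocorrelation).
dw_approx = 2 * (1 - 0.402385)
print(f"Approximate DW from lag-1 autocorrelation: {dw_approx:.5f}")

# Made-up residuals, for illustration only (not from the source data).
toy_residuals = [1.0, -1.0, 1.0, -1.0]
print(f"Toy MAE: {mean_absolute_error(toy_residuals)}")
print(f"Toy DW:  {durbin_watson(toy_residuals)}")
```

The rule-of-thumb value is close to, but not identical with, the reported Durbin-Watson statistic of 1.18944, as expected for an approximation. The toy residuals also show why the MAE uses absolute values: their plain average is zero, while their MAE is 1.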

In determining whether the model can be simplified, notice that the highest P-value on the independent variables is 0.0000, belonging to Population. Since the P-value is less than 0.05, that term is statistically significant at the 95.0% confidence level. Consequently, you probably don’t want to remove any variables from the model.

## Charts