### General structure

Figure 3 outlines the methodological framework of the study. First, we developed an imported COVID-19 risk measure. The rationale for the measure is that the risk of importation is proportional to the number of entries and the prevalence of COVID-19 in the country of departure. Thus, air travel volumes from selected countries to Seoul and the prevalence of COVID-19 cases in departure countries were multiplied to derive the measure. The main improvement over the measurement from previous studies is that real-time mobile data was used to estimate the volume of air travel, and time-varying detection rates were considered to estimate the prevalence of COVID-19. Use of previous year’s air travel data^{10} and not considering the detection efficiency^{13} were suggested as limitations in previous studies.

Second, the developed measure was used to assess travel-related control measures. Using the measure, we fit our model to the period where the detection rate was assumed to be 100% as both day-of-arrival testing and 14-day post-entry quarantine were in effect. We then calculated the number of expected imported cases during the period when only body temperature examinations, health surveys and declaration of travel records were required. The expected number was compared with the observed number of imported cases to estimate the number of undetected imported cases and the detection rate before mandatory testing.

### Data sources

Data were obtained from three sources. First, daily numbers of confirmed COVID-19 cases in Seoul were obtained^{18} to identify the number of imported cases of COVID-19 in Seoul for the period between January 24, 2020 and June 30, 2020; the first case of COVID-19 in Seoul was confirmed on January 24, 2020. Second, the daily number of roaming users from each country to Seoul between January 1 and June 30 was acquired from Korea Telecom (KT). Travelers from Korea to other countries use roaming services to make and receive calls in regions outside the coverage areas of their home networks. KT has the second-largest market share among mobile operators in Korea, at 31%^{26}; therefore, the KT data were considered sufficient to estimate the trend in the number of entries into Korea. Third, daily numbers of confirmed COVID-19 cases between January 1 and July 7 were used^{1} to estimate the prevalence of COVID-19 in countries outside of Korea.

### Construction of the dataset

#### Country selection

Countries of interest were selected based on the travel history of imported COVID-19 cases reported in Seoul. For example, if an identified imported case had traveled to Italy, Italy was selected. Cases with a history of travel to more than one country (24 cases) or to unknown regions (two cases) were excluded from this procedure, as the source of infection could not be specified. Thirty countries were selected (Table 3). However, Austria, China, Malaysia, Poland, Singapore, Thailand and Vietnam were excluded from the model fitting procedure, as no cases were imported from these countries after April 1, 2020, when testing on the day of arrival and 14-day post-entry quarantine became mandatory for entrants. Including countries with no imported cases after the implementation of mandatory testing could bias the model's estimates.
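The selection rule can be sketched in a few lines of Python. The case records, the `select_countries` helper, and the reading of "after April 1" as a report date on or after April 1, 2020 are all assumptions made for illustration, not the study's actual code or data.

```python
from datetime import date

# Hypothetical imported-case records: (report date, countries in travel history).
cases = [
    (date(2020, 3, 10), ["Italy"]),
    (date(2020, 4, 20), ["Italy"]),
    (date(2020, 3, 15), ["Austria"]),
    (date(2020, 3, 20), ["France", "Spain"]),  # more than one country: excluded
    (date(2020, 5, 2), []),                    # unknown region: excluded
]

# Assumed cutoff: the day mandatory testing and quarantine took effect.
MANDATORY_TESTING_START = date(2020, 4, 1)


def select_countries(cases, cutoff=MANDATORY_TESTING_START):
    """Return (all selected countries, countries kept for model fitting)."""
    selected = set()
    with_cases_after_cutoff = set()
    for reported_on, countries in cases:
        if len(countries) != 1:
            continue  # source of infection cannot be specified
        country = countries[0]
        selected.add(country)
        if reported_on >= cutoff:
            with_cases_after_cutoff.add(country)
    return selected, selected & with_cases_after_cutoff


selected, fitted = select_countries(cases)
print(sorted(selected))  # ['Austria', 'Italy']
print(sorted(fitted))    # ['Italy']
```

In this toy input Austria is selected but dropped from fitting, because it has no imported case on or after the cutoff.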

#### Entrants from each country to Seoul

The number of entrants from the selected countries to Seoul was calculated using data provided by KT. These data give the daily number of roaming users by departure country and residential region in Korea during 2020. As we used air travel volume from a single mobile operator, the data do not represent the exact travel volume. However, these data were reported to be representative of domestic market trends^{27} and of the volume of international travel^{28}.

#### Estimating the prevalence of COVID-19

The prevalence of COVID-19 in the selected countries was estimated to assess the risk of exposure to COVID-19 among entrants traveling to Seoul, Korea. Local prevalence of COVID-19 in the selected countries was derived from the daily number of new confirmed cases^{1}. The reported incidence of COVID-19 is considered an underestimate because of incomplete testing^{29,30}. Thus, we extended a previously used method^{11}. Daily testing policy strength for each country was taken from the Oxford COVID-19 Government Response Tracker (OxCGRT)^{31}. The OxCGRT rates the strength of the testing policy as follows: 0, no testing policy; 1, testing only of those who both have symptoms and meet specific criteria (key workers, classified as contacts, traveled abroad); 2, testing of anyone showing symptoms; and 3, open public testing. Based on a previously suggested reporting rate of 0.092 (95% confidence interval [CI] 0.05, 0.20)^{32}, we assigned testing policies 1, 2, and 3 reporting rates of 0.05, 0.092, and 0.20, respectively. The reporting rate for no testing policy (0) was assumed to be 0.01. Detection rates from two additional studies were considered in a sensitivity analysis. The results are provided in the Supplementary File.
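As a minimal sketch, the assignment of reporting rates to testing-policy levels amounts to a lookup table. The rates are those stated in the text; the `reporting_rates` helper and the five-day policy series are hypothetical.

```python
# Reporting rate assumed for each OxCGRT testing-policy level (values from the text).
REPORTING_RATE_BY_POLICY = {0: 0.01, 1: 0.05, 2: 0.092, 3: 0.20}


def reporting_rates(policy_levels):
    """Map a daily series of OxCGRT testing-policy levels to reporting rates."""
    return [REPORTING_RATE_BY_POLICY[level] for level in policy_levels]


# Hypothetical country whose testing policy expands over five days.
daily_policy = [0, 1, 1, 2, 3]
print(reporting_rates(daily_policy))  # [0.01, 0.05, 0.05, 0.092, 0.2]
```

Because the policy level varies by day, the resulting reporting rate is time-varying for each country.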

The method for estimating the incident (\(I_{t}\)) and prevalent (\(P_{t}\)) infectious cases on day *t*, accounting for the lag between becoming infectious and being reported and for the reporting rate, is described in detail by Fauver et al.^{11} The model used prevalent cases because both existing (prevalent) and new (incident) cases serve as sources of infection. Briefly, \(I_{t-d-2}\) was estimated from the newly reported cases (\(C_{t}\)) and the reporting rate (\(\rho_{t}\)) on day *t*. The time from onset of symptoms to isolation (testing), *d*, was assumed to be 5 days, and cases were considered to become infectious 2 days before the onset of symptoms^{33}.

$$I_{t-d-2} = \frac{C_{t}}{\rho_{t}}$$

(1)
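Equation (1) shifts each day's reported count back by *d* + 2 days and inflates it by the reporting rate. The sketch below illustrates this under the stated assumption *d* = 5; the function name and the input series are hypothetical.

```python
def incident_infectious_cases(reported, rates, d=5, presymptomatic=2):
    """Back-calculate incident infectious cases: I[t - d - 2] = C[t] / rho[t].

    The result is aligned with `reported`. The last d + 2 entries are None
    because the reports that would inform them are not yet available.
    """
    lag = d + presymptomatic
    incident = [None] * len(reported)
    for t in range(lag, len(reported)):
        incident[t - lag] = reported[t] / rates[t]
    return incident


# Hypothetical reported cases C_t and a constant reporting rate of 0.05.
reported = [0, 0, 0, 0, 0, 0, 0, 5, 10, 20]
rates = [0.05] * len(reported)
incident = incident_infectious_cases(reported, rates)
print(incident)
```

With these inputs, the 5 cases reported on day 7 imply roughly 100 people who became infectious on day 0.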

Then, \(I_{t}\) and the probability that a patient who became infectious on day *i* was still infectious on day *t* were combined to estimate \(P_{t}\).

$$P_{t} = \sum_{i = 1}^{t-1} I_{i} \left( 1 - \gamma\left( t - i \right) \right) + I_{t}$$

(2)
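Equation (2) can be read as a running sum over earlier incident cases, each weighted by the probability of still being infectious. The sketch below takes the cumulative distribution function as an argument. The step function used in the example is a deliberately simple stand-in, not the gamma distribution used in the study, and indices are 0-based rather than starting at 1.

```python
def prevalent_infectious_cases(incident, cdf):
    """P[t] = sum over i < t of I[i] * (1 - cdf(t - i)), plus I[t]."""
    prevalent = []
    for t, new_cases in enumerate(incident):
        still_infectious = sum(
            incident[i] * (1.0 - cdf(t - i)) for i in range(t)
        )
        prevalent.append(still_infectious + new_cases)
    return prevalent


def step_cdf(days):
    """Hypothetical stand-in: every case stops being infectious after 2 days."""
    return 1.0 if days >= 2 else 0.0


print(prevalent_infectious_cases([10, 20, 30], step_cdf))  # [10.0, 30.0, 50.0]
```

On the third day the first 10 cases have dropped out, so only the 20 cases from the previous day and the 30 new cases remain.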

The infectious period was assumed to follow a gamma distribution with probability density \(f(x)\) and a mean and standard deviation of 7 and 4.5 days, respectively; \(\gamma(t-i)\) is its cumulative distribution function evaluated at \(t-i\). As in Fauver et al., the shape (\(\alpha\)) and rate (\(1/\theta\)) of the gamma distribution were calculated^{34}.

$$f\left( x \right) = \frac{\left( 1/\theta \right)^{\alpha}}{\Gamma\left( \alpha \right)} x^{\alpha - 1} e^{-x/\theta}$$

(3)

where \(\Gamma\left( \alpha \right) = \int_{0}^{\infty} t^{\alpha - 1} e^{-t} \, dt\).
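For illustration, the shape and rate can be obtained from the stated mean and standard deviation by moment matching (shape = mean²/SD², rate = mean/SD²). This is assumed here to be the calculation the text refers to. The cumulative distribution function is evaluated with the standard series for the regularized lower incomplete gamma function, so that only the Python standard library is needed; both helper names are hypothetical.

```python
import math


def gamma_shape_rate(mean, sd):
    """Moment matching: mean = shape / rate, variance = shape / rate**2."""
    shape = (mean / sd) ** 2
    rate = mean / sd ** 2
    return shape, rate


def gamma_cdf(x, shape, rate):
    """Gamma CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    z = rate * x
    term = 1.0 / shape          # n = 0 term of the series
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (shape + n)
        total += term
        n += 1
    return math.exp(shape * math.log(z) - z - math.lgamma(shape)) * total


shape, rate = gamma_shape_rate(7.0, 4.5)
print(round(shape, 3), round(rate, 3))  # 2.42 0.346
print(gamma_cdf(7.0, shape, rate))      # share no longer infectious after 7 days
```

Passing `lambda days: gamma_cdf(days, shape, rate)` as the cumulative distribution function gives the probability of having stopped being infectious that enters Eq. (2).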

The calculated \(P_{t}\) was divided by the total population of each country in 2020 and scaled to give the prevalence per 100,000. The entrant and COVID-19 prevalence datasets were merged by country and date. The weekly average number of entrants and the weekly average prevalence of COVID-19 per 100,000 were calculated from the merged dataset. Finally, the weekly sum of imported COVID-19 cases reported in Seoul was merged into the dataset containing the weekly average number of entrants and the prevalence of COVID-19.
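The merge and weekly averaging can be sketched with plain dictionaries. The data layout and the `weekly_risk_inputs` helper are hypothetical. ISO week numbers are assumed, which is consistent with week 16 beginning on April 13, 2020.

```python
from collections import defaultdict
from datetime import date


def weekly_risk_inputs(entrants, prevalent, population):
    """Average daily entrants and prevalence per 100,000 by (country, week).

    entrants, prevalent: dicts keyed by (country, date).
    population: dict keyed by country.
    """
    buckets = defaultdict(lambda: {"entrants": [], "prevalence": []})
    # Merge by country and date: keep only keys present in both datasets.
    for country, day in entrants.keys() & prevalent.keys():
        week = day.isocalendar()[1]
        bucket = buckets[(country, week)]
        bucket["entrants"].append(entrants[(country, day)])
        bucket["prevalence"].append(
            prevalent[(country, day)] / population[country] * 100_000
        )
    return {
        key: {name: sum(values) / len(values) for name, values in cols.items()}
        for key, cols in buckets.items()
    }


# Hypothetical two-day input for one country.
entrants = {("Italy", date(2020, 4, 13)): 10, ("Italy", date(2020, 4, 14)): 20}
prevalent = {("Italy", date(2020, 4, 13)): 600, ("Italy", date(2020, 4, 14)): 1200}
weekly = weekly_risk_inputs(entrants, prevalent, {"Italy": 60_000_000})
print(weekly)
```

The weekly sum of imported cases reported in Seoul would then be joined to this table on the same (country, week) key.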

### Statistical analysis

#### Description of the new measure

The method used by De Salazar et al.^{10} was extended to estimate the expected number of imported COVID-19 cases. The measure indicating the risk of importing COVID-19 was calculated as the product of the number of entries and the prevalence of COVID-19 in the selected countries. Specifically, the expected number of imported cases was assumed to follow an overdispersed Poisson distribution and to depend on the product of the number of entrants (\(E_{w}\)) and the prevalence per 100,000 (\(P_{w}\)) in week *w*.

$$\begin{gathered} \text{Expected number of imported cases} \sim \text{Quasi-Poisson}\left( \lambda_{w} \right) \\ \lambda_{w} = \beta_{0} + \beta \left( E_{w} \times P_{w} \right) \end{gathered}$$

(4)
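A minimal sketch of fitting Eq. (4) follows, using iteratively reweighted least squares for a Poisson mean with an identity link. A quasi-Poisson fit has the same point estimates as a Poisson fit and differs only in the estimated dispersion. The function name, starting values, iteration count, and data are assumptions for illustration, not the study's implementation.

```python
def fit_identity_poisson(x, y, iterations=25):
    """Fit lambda = b0 + b1 * x by iteratively reweighted least squares.

    With an identity link the working response is y itself and the weights
    are 1 / lambda, so each step is a weighted straight-line fit.
    Returns (b0, b1, dispersion), where dispersion is the Pearson estimate.
    """
    mu = [max(v, 0.5) for v in y]  # crude positive starting values
    b0 = b1 = 0.0
    for _ in range(iterations):
        w = [1.0 / m for m in mu]
        sw = sum(w)
        xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
        ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
        b1 = sxy / sxx
        b0 = ybar - b1 * xbar
        mu = [max(b0 + b1 * xi, 1e-8) for xi in x]  # keep the mean positive
    pearson = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    return b0, b1, pearson / (len(x) - 2)


# Hypothetical weekly entrants, prevalence per 100,000, and imported cases.
entrants = [100, 200, 300, 400]
prevalence = [1.0, 1.5, 2.0, 2.5]
risk = [e * p for e, p in zip(entrants, prevalence)]  # E_w * P_w
cases = [2, 4, 9, 12]
b0, b1, dispersion = fit_identity_poisson(risk, cases)
print(b0, b1, dispersion)
```

A dispersion estimate well above 1 would indicate overdispersion relative to a plain Poisson model.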

#### Estimating the effectiveness of post-entry quarantine

The model was conservatively fitted to data from week 16 (April 13) onwards. Testing on the day of arrival and 14-day post-entry quarantine took effect for all arrivals from April 1. However, as the average incubation period is 5.1 days and the 95th percentile is 11.7 days^{35}, many imported cases reported during weeks 14 (2020.03.30–2020.04.05) and 15 (2020.04.06–2020.04.12) could have arrived before April 1. The regression coefficient \(\beta\) was estimated from data from week 16 onwards using the maximum likelihood method. Then, the expected number of imported cases was calculated based on the estimated \(\beta\). An initial sample of 500,000 was used to calculate the 95% CI for the expected number of imported cases. The fit of the model was assessed by checking whether the reported imported cases fell within the confidence intervals of the fitted estimates and by using the \(R^{2}\) statistic.

All results are given by week number. We used data from January 1 to June 30 (weeks 1–26); the dates corresponding to each week are given in the tables. The number of undetected imported cases was calculated by subtracting the number of reported imported cases from the number of expected cases. As in a previous study^{36}, the upper and lower bounds for undetected imported cases were calculated by subtracting reported cases from the upper and lower bounds of expected imported cases, respectively. Undetected cases were presented as 0 if the point estimate or CI bound of undetected cases was < 0. In addition, the reporting rate for imported cases was calculated as the ratio of reported imported cases to expected imported cases.
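The subtraction and the truncation at zero can be written as a short sketch; the function name and the numbers are hypothetical.

```python
def undetected_cases(reported, expected, expected_low, expected_high):
    """Undetected = expected - reported for the point estimate and both
    CI bounds; negative values are presented as 0."""
    return tuple(
        max(value - reported, 0.0)
        for value in (expected, expected_low, expected_high)
    )


# Hypothetical week: 10 reported cases, 25 expected (95% CI 8 to 40).
print(undetected_cases(10.0, 25.0, 8.0, 40.0))  # (15.0, 0.0, 30.0)
```

Here the lower bound is shown as 0 because the lower bound of the expected cases (8) falls below the reported count.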

$$\text{Reporting rate (\%)} = \frac{\text{Reported imported COVID-19 cases}}{\text{Expected imported COVID-19 cases}} \times 100$$

(5)

The lower bound for the reporting rate was calculated as the ratio of the reported imported cases to the upper bound of the expected cases, and the upper bound for the reporting rate was calculated as the ratio of the reported imported cases to the lower bound of the expected cases. The reporting rate was presented as 100% if reported cases exceeded expected cases. By multiplying the calculated reporting rate by the number of imported cases reported after the post-entry testing and quarantine policies for entrants took effect, the number of imported cases prevented without these policies was calculated.
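The reporting rate and its bounds follow directly from Eq. (5), with the bounds of the expected cases swapped. In the sketch below the function name and inputs are hypothetical, and returning 100% for a non-positive expected count is an added assumption to avoid division by zero.

```python
def reporting_rate(reported, expected, expected_low, expected_high):
    """Return (point, lower, upper) reporting rates in percent.

    The lower bound uses the upper bound of expected cases and vice versa;
    rates are capped at 100% when reported cases exceed expected cases.
    """
    def percent(denominator):
        if denominator <= 0:  # assumption: treat as fully reported
            return 100.0
        return min(reported * 100.0 / denominator, 100.0)

    return percent(expected), percent(expected_high), percent(expected_low)


# Hypothetical week: 10 reported cases, 25 expected (95% CI 8 to 40).
print(reporting_rate(10.0, 25.0, 8.0, 40.0))  # (40.0, 25.0, 100.0)
```

The upper bound is capped at 100% in this example because the 10 reported cases exceed the lower bound of 8 expected cases.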

### Ethical statement

The data used was publicly available, completely non-identifiable data collected for disease control purposes or in aggregate form. Thus, no ethical approval was required.