Safe traveling in public transport amid COVID-19

Safe travel in public transport amid COVID-19 requires wearing a mask properly, which has an effect similar to that of 2-m social distancing.


Analysis of inhalation effects when wearing a mask.
The analysis of the effect of mandatory mask-wearing, one of the most important topics in this study, was conducted as follows. The mask effect was analyzed by separating the effects of the mask on inhalation and exhalation. The inhalation analysis was based on previously verified inhalation effects, and the results indicated that KF94 and KF80 masks exhibit 98.7% and 87.6% anti-inhalation effects, respectively (17).
1.2 Analysis of the degree of diffusion of droplets with or without wearing a mask.
In this study, we examined the change in particle concentration based on the distance from the source of aerosols due to coughing. To this end, we conducted an experiment to determine the degree of spread of imitated virus particles based on whether a mask was worn. To analyze the probability of infection according to the distance between individuals, the diffusion distance of aerosols based on the presence or absence of a mask was analyzed via direct experiments. The purpose of this study is to determine the degree of transport of infectious substances through aerosols based on the degree of response by individuals to the current policy of wearing masks in South Korea.
In the experiment, we simulated the spraying of aerosols into the air by a cough. The sprayed aerosols can be divided into three categories: droplets larger than 5 µm, micro-sized aerosol droplets, and nano-sized aerosol droplets. It was observed that droplets larger than 5 µm fall quickly to the ground and have a range of influence of approximately 2 m. However, micro- and nano-sized aerosols remain in the air for a longer time and can hence spread further. We examined the variation in the concentrations of micro- and nano-sized particles with respect to the distance from the aerosol source based on whether a mask was worn. Fig. S1 illustrates the Cough Aerosol Simulator (CAS) for producing micro-sized cough aerosols. A spraying device (DIA-3530, Furupla, Japan) was used to produce aerosols, and an optical particle counter (OPC; 11-D, Grimm, Germany) was used to measure the aerosol size. Some modifications were made to the CAS for generating nano-sized cough aerosols, as depicted in Fig. S2. The modified CAS consisted of a mist generator (VMG, Vitals, Korea), a compressed dry air supplier (LTC-7502, Ace power, China), a buffer mixer, and two hydraulic valves (HPW2130, Autosigma, Korea) that selectively regulated two flow paths. It also had two ball flow controllers: one for generation flow control (50 LPM, RMA-23-SSV, Dwyer, USA) and the other for the additional flow needed to make up the total flow (300 LPM, LF401, Unicell, Korea). A total flow rate of 230 LPM was normally vented to the drain. During a cough, 1.7 L was emitted through the mouth of a mannequin over a cough duration of 0.44 s. This condition was modeled using a sinusoidal function with the average cough volume (1.69 L) and maximum cough flow rate (6.01 L/s) observed for influenza patients in a previous study (18). The mannequin was used to imitate a human face wearing a mask, and the aerosol was sprayed through its mouth.
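The cough condition described above can be checked with a short numerical sketch. It assumes that the "sinusoidal function" is a single half-sine pulse over the cough duration; the text does not state the exact form, so this shape and the function names are illustrative assumptions. Under that assumption, the peak flow and duration given in the text imply an emitted volume close to the reported value.

```python
import math

PEAK_FLOW_L_PER_S = 6.01  # maximum cough flow rate reported in the text
COUGH_DURATION_S = 0.44   # cough duration reported in the text


def cough_flow(t, peak=PEAK_FLOW_L_PER_S, duration=COUGH_DURATION_S):
    """Flow rate in L/s at time t (s), assuming a half-sine pulse (our assumption)."""
    if t < 0.0 or t > duration:
        return 0.0
    return peak * math.sin(math.pi * t / duration)


def cough_volume(peak=PEAK_FLOW_L_PER_S, duration=COUGH_DURATION_S, steps=10_000):
    """Total emitted volume in L, integrated numerically with the midpoint rule."""
    dt = duration / steps
    return sum(cough_flow((k + 0.5) * dt, peak, duration) for k in range(steps)) * dt


# Closed-form integral of a half-sine pulse: 2 * peak * duration / pi
analytic_volume = 2.0 * PEAK_FLOW_L_PER_S * COUGH_DURATION_S / math.pi
print(f"numerical: {cough_volume():.3f} L, closed form: {analytic_volume:.3f} L")
```

Both values come out at about 1.68 L, which is consistent with the 1.69 L average cough volume and the 1.7 L emitted per cough quoted in the text.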
KF94 and KF80 masks were used because they are readily available in domestic pharmacies. The mass concentrations of micro-sized cough aerosol particles were measured at various distances from the mannequin before and after the mask was put on to estimate the reduction rate. Gamble's solution, which represents the interstitial fluid deep within the lungs, was used to simulate the droplets produced by a cough (19). It was prepared by sequentially dissolving the following constituents in sterilized distilled water: 0.095 g/L magnesium chloride, 6.019 g/L sodium chloride, 0.298 g/L potassium chloride, 0.126 g/L disodium hydrogen phosphate, 0.063 g/L sodium sulfate, 0.368 g/L calcium chloride dihydrate, 0.574 g/L sodium acetate, 2.604 g/L sodium bicarbonate, and 0.097 g/L sodium citrate dihydrate. These reagents were supplied by Sigma-Aldrich (MO, USA). After dissolution, the pH of the solution was adjusted to 7.4 using 37% HCl (Acros Organics, NJ, USA). SARS-CoV-2 has a diameter of approximately 70 to 90 nm. TiO2 particles (Merck, USA) were used to simulate virus particles because, in our previous study (21), a geometric mean diameter of ~80 nm was obtained from a sprayed TiO2 distilled-water solution. Accordingly, we prepared a 5% TiO2 Gamble's solution to mimic the pulmonary fluid and virus particles and sprayed it using the mist generator illustrated in Figs. S1 and S2. The experiments were performed in an exposure chamber with a length of 4 m, width of 3 m, and height of 2.4 m. The exposure chamber was maintained at a temperature of 22 °C and a relative humidity of 50%. During the exposure assessments, the air-change rate was maintained at 0.5 per hour to minimize airflow. The flow velocity was 20 cm/min. Between the exposure assessments, the air-change rate was increased to 30 per hour to clean the air inside the chamber and sufficiently reduce the background particle level in the exposure chamber. Clean air that had passed through the HEPA filter flowed from the top inlet to the bottom outlet.
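The two ventilation settings can be put in perspective with a rough calculation. The sketch below assumes an ideally well-mixed chamber, in which the background concentration decays exponentially with the air-change rate; this model and the function names are illustrative assumptions and are not taken from the study, whose top-to-bottom flow may behave differently.

```python
import math

CHAMBER_VOLUME_M3 = 4.0 * 3.0 * 2.4  # length x width x height from the text


def supply_flow_m3_per_h(air_changes_per_hour, volume_m3=CHAMBER_VOLUME_M3):
    """Clean-air flow needed to achieve a given air-change rate."""
    return air_changes_per_hour * volume_m3


def remaining_fraction(air_changes_per_hour, hours):
    """Fraction of background particles left after `hours` (well-mixed assumption)."""
    return math.exp(-air_changes_per_hour * hours)


def minutes_to_reach(fraction, air_changes_per_hour):
    """Minutes needed for the background level to fall to `fraction` of its start value."""
    return 60.0 * math.log(1.0 / fraction) / air_changes_per_hour


print(supply_flow_m3_per_h(0.5))   # flow during exposure assessments, m^3/h
print(minutes_to_reach(0.01, 30))  # purge time to 1% of the background at 30 per hour
```

Under this assumption, purging at 30 air changes per hour brings the background down to 1% in under ten minutes, whereas at 0.5 air changes per hour only about 40% of the particles are removed by ventilation in one hour.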

Study limitations and suggestions for future research
The scope of this study was limited to examining the formation of cough aerosols and their blockage by a mask. Moreover, this study assumed that a mask is worn properly and fits perfectly (i.e., no leakage), and that there is no additional airflow. However, the effects of several other related factors must be considered, such as the type and design of mask, and a user's head shape and wearing behavior, as these factors may cause aerosol leakage if a mask is poorly fitted. In addition, the use of air conditioners and the effects of wind can alter the distribution of aerosols. The effects of these factors can all be examined in future studies.

Fig. S1. Cough Aerosol Simulator (CAS).
This device was used to measure the distribution and mass concentration of micro-sized aerosols using an OPC (11-D, Grimm, Germany) at locations 0.5, 1, 1.5, 2, and 3 m from the aerosol outlet. The sampling inlet was parallel to the line of flow to ensure the best efficiency. The aerosol generation height was 1.6 m and the sampling height was 1.5 m, taking into account the slight downflow.
The modified device (Fig. S2) was designed to measure the distribution and mass concentration of nano-sized aerosols using an ELPI (ELPI+, Dekati, Finland) at locations 0.5, 1, 1.5, 2, and 3 m from the aerosol outlet. The airflows for mist generation and coughing were 10 LPM and 220 LPM, respectively. When only the mist generation airflow was on and the coughing airflow was off, the mannequin's solenoid valve (cough valve) was closed and the drain valve was open. During coughing, the drain valve was closed and the cough valve was open.
The experiment showed that when the mask was not worn, approximately 10% of the particles in the solution representing cough aerosols spread farther than 2 m. In contrast, when the mask was worn, most of these particles were blocked by the mask, and most of the particles that passed through the mask were smaller than 576 nm, as shown in Table S1. This experiment focused on the particle sizes listed in Fig. S3 and Table S1. The results of the analysis indicated that mask-wearing blocked most particles 576 nm or larger in size and that only a few particles smaller than 576 nm passed through the mask. This result indicates that masks can block most particles in the solution, which represent cough aerosols. The number of viruses is likely to be proportional to the fluid volume of the particles. Particles 576 nm or larger in size, which account for most of the volume, did not pass through the mask, and mask-wearing reduced both the number and volume of the particles by more than 99.8%.
Therefore, we assumed that the reduction in the infection rate is related to the reduction in the number of particles. However, on assessing the safety of travel in public transportation, we observed that the interiors of public transportation vehicles were not securely ventilated. Moreover, the degree of contact between passengers and the contact time were relatively high, which can increase the risk of infection. Therefore, the experiment was conducted with a focus on nanosize particles that can pass through a mask and thereby pose an infection risk. The experiments showed that when a mask is not worn, virus particles can spread farther than 3 m and that wearing a mask limits this distance to less than 2 m, even if the particles pass through the mask. We determined that mandatory mask-wearing results in COVID-19-preventive effects similar to those resulting from maintaining 2-m physical distancing. We applied these experimental results to the diffusion model of a public transportation scenario.

Figure. Comparison of exhalation and combined exhalation and inhalation through a mask. Panels: Exhalation of Mask; Exhalation and Inhalation of Mask.

Construction of trip chains using smart card data
In this study, we constructed real travelers' trip chains using smart card data and household travel survey data. Smart card data contain the encrypted identifier (ID) of the transit user, their location, their boarding time, and their alighting time, as shown in Table S3. To build the encounter network for public transport journeys, we composed the trip chain of each user from the ID-specific travel records in the smart card data. Based on the resulting trip chain, each user's origin-destination (OD) pair and activity were tracked by time zone. Finally, we used household travel survey data to construct the portions of the trip chains outside the public transportation system. However, these data provide information about homes and workplaces only at the level of zone units. Therefore, this study focused on encounters within the public transportation system along the total trip chain, as shown in Fig. S5.
Fig. S5. Construction of trip chains using smart card data and household travel-survey data.
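The grouping of smart card records into per-user trip chains can be sketched as follows. The record layout, field names, and station labels are hypothetical stand-ins for the fields listed in Table S3, and this is only a minimal illustration of the idea, not the procedure actually used in the study.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Trip:
    card_id: str      # encrypted user ID (hypothetical field name)
    origin: str       # boarding station
    destination: str  # alighting station
    board_min: int    # boarding time, minutes since midnight
    alight_min: int   # alighting time, minutes since midnight


def build_trip_chains(trips):
    """Group trips by card ID and order each user's trips by boarding time."""
    chains = defaultdict(list)
    for trip in trips:
        chains[trip.card_id].append(trip)
    for chain in chains.values():
        chain.sort(key=lambda trip: trip.board_min)
    return dict(chains)


def activities(chain):
    """Out-of-vehicle activities between consecutive trips:
    (place left, place of next boarding, start minute, end minute)."""
    return [(a.destination, b.origin, a.alight_min, b.board_min)
            for a, b in zip(chain, chain[1:])]


records = [
    Trip("u1", "B", "A", 1080, 1110),  # evening return trip
    Trip("u1", "A", "B", 480, 510),    # morning trip
    Trip("u2", "C", "B", 490, 515),
]
chains = build_trip_chains(records)
print(activities(chains["u1"]))
```

For user `u1`, the sketch recovers one daytime activity at station `B` between the two trips, which is the kind of OD pair and activity information tracked along the trip chain.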

Public Transportation Trip Assignment using the agent-based model
In this study, we implemented the trip chains based on smart card data and performed traffic assignment with an agent-based model. Because smart card data consist only of OD pairs, they do not provide the movement path of individual users. Hence, it is necessary to estimate the movement path of each individual over time through traffic assignment. Based on these traffic assignment steps, it is possible to derive each user's movement route by time zone, the congestion of each section, and the encounter network. We performed the traffic assignment along the shortest route in public transportation, following the user manual of the multi-agent transport simulation MATSim (24,14).
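To illustrate what shortest-route assignment of an OD pair means, the sketch below applies Dijkstra's algorithm to a toy network. This is a generic textbook technique swapped in for illustration; it is not MATSim's router, and the station labels and travel times are hypothetical.

```python
import heapq


def shortest_route(graph, origin, destination):
    """Dijkstra's algorithm on {station: [(neighbor, minutes), ...]}.
    Returns (total_minutes, [stations]) or None if no route exists."""
    queue = [(0, origin, [origin])]
    best = {}
    while queue:
        cost, station, path = heapq.heappop(queue)
        if station == destination:
            return cost, path
        if station in best and best[station] <= cost:
            continue  # already settled via a route that is at least as short
        best[station] = cost
        for neighbor, minutes in graph.get(station, []):
            heapq.heappush(queue, (cost + minutes, neighbor, path + [neighbor]))
    return None


# Toy network with hypothetical in-vehicle times in minutes.
network = {
    "A": [("B", 4), ("C", 2)],
    "C": [("B", 1), ("D", 7)],
    "B": [("D", 3)],
    "D": [],
}
print(shortest_route(network, "A", "D"))
```

Applied to every OD pair in the smart card data, a router of this kind yields the sequence of sections each agent traverses, from which section-level congestion and co-presence can then be derived.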
The study was conducted with MATSim, a transportation simulation framework focused on transportation planning (25), using shortest-route estimation for each user (26). The shortest route-based traffic assignment model (Eq. 2) uses the following quantities: sit, the occupancy level at which no seats are available; two parameters denoting the percentage increase in discomfort caused by standing in a vehicle; and the maximum percentage increase when the vehicle is full. We set the congestion level for each section of the corresponding route by time zone, as illustrated in Table S4 and Fig. S6. Subsequently, based on the constructed result, we implemented the encounter network to calculate the contact duration. We analyzed this congestion for each section and reflected it in the infection rate of each section. Furthermore, Fig. S6 shows a map of congestion in the Seoul Metropolitan Subway Network obtained via transit assignment using MATSim. The congestion indicators used in this study were calculated from basic data on the large direct-current electric vehicles operating in Seoul, which follow the standard specifications for urban railway vehicles. The passenger quota is defined as 160 passengers per vehicle, with an area of 0.35 m² per person, which corresponds to 100% congestion. The more the number of passengers exceeds this quota, the lower the level of service, the higher the congestion, and the lower the average distance between agents, which increases the probability of infection with respiratory diseases. At approximately 170% congestion, there is virtually no empty space in the vehicle, and the agents are in close contact with each other in terms of possible infection. In this study, the area per person was assumed to be a circle and converted into the distance between individuals. As a result, agents were found to be about 1.3 m apart at 100% congestion and about 1 m apart at 160% congestion; at the latter level, apart from the area occupied by the human body, there is virtually no empty space.
We divided the level of service in public transportation according to the congestion level (27).
Next, we combined the average distance between individual agents with the changes in the particle distribution according to distance. This combined value is reflected as a change in the probability of an individual being infected. We approximated the relationship between the number of particles and the distance, which has nonlinear characteristics, with a linear regression function (Fig. S7), where x is the distance and y is the number of particles observed at this distance. The slope, y-intercept, and R^2 are −7,011.2, 40,813, and 0.9476, respectively. As the minimum distance between individual agents is 0.51 m, we set the minimum observational diffusion distance to 0.5 m. Ratio is set to 1 at the minimum distance of 0.5 m between agents; as the average distance increases, Ratio decreases according to the linear regression function.
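The distance adjustment can be sketched numerically from the regression coefficients given above. The sketch assumes that Ratio is the regression value at a given distance divided by its value at 0.5 m, held at 1 for shorter distances and floored at 0; this normalization is our reading of the description rather than a formula stated in the text.

```python
SLOPE = -7011.2        # regression slope from the text (particles per metre)
INTERCEPT = 40813.0    # regression y-intercept from the text
MIN_DISTANCE_M = 0.5   # distance at which Ratio is set to 1


def particle_count(distance_m):
    """Linear regression estimate of the number of particles at a distance."""
    return SLOPE * distance_m + INTERCEPT


def ratio(distance_m):
    """Assumed normalization: regression value relative to its value at 0.5 m."""
    d = max(distance_m, MIN_DISTANCE_M)
    return max(particle_count(d) / particle_count(MIN_DISTANCE_M), 0.0)


for d in (0.5, 1.0, 2.0):
    print(d, round(ratio(d), 3))
```

Under this assumption, the adjustment falls from 1 at 0.5 m to roughly 0.72 at 2 m, so the distance effect applied to the transmission probability is moderate over the range of in-vehicle spacings considered.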

Familiar Stranger group in Encounter Network
In the public transportation passenger encounter pattern studied by Sun and colleagues (6) through an encounter network, each node represents a passenger and each undirected edge represents a pair of passengers staying in the same vehicle. Smart card data, including passenger commuting time, location, and the corresponding bus ID, were analyzed to form the network. Sun and colleagues (6) then extended the network and proposed a time-varying weighted Public Transportation Encounter Network (PEN) that can model an infectious process. Specifically, the PEN is well suited to investigating the spread of infectious diseases through public transportation services because it provides information on passengers' direct contact with infected cases. First, we divide the entire study period into equal time intervals $t = 1, \dots, T$ of length $\tau$. We consider the weighted graph $G_t(\mathcal{N}, \mathcal{E}_t, \mathcal{W}_t)$, where $\mathcal{N} = \{i : i = 1, \dots, N\}$ is the node set, each node represents an individual, and $N$ is the total number of passengers. $\mathcal{E}_t$ denotes the edge set and $\mathcal{W}_t$ denotes the weight set. The edge between $i$ and $j$ ($i, j \in \mathcal{N}$), denoted by $e_{i,j,t}$, exists if $i$ and $j$ stayed in the same vehicle during time interval $t$. The weight of $e_{i,j,t}$, expressed as $w_{i,j,t}$, is $w_{i,j,t} = d_{i,j,t}/\tau$, where $d_{i,j,t}$ denotes the duration for which $i$ and $j$ stay in the same vehicle during interval $t$. By definition, $0 \le w_{i,j,t} \le 1$. Weights are used to capture the fact that epidemic transmission is related to the duration of contact.
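The edge weights of the PEN can be illustrated with a small helper. It assumes that the weight in each interval is the shared in-vehicle duration divided by the interval length, which is consistent with the stated bound between 0 and 1; the function name and the minute-based time axis are illustrative choices.

```python
def pen_weights(shared_start, shared_end, tau):
    """Map a shared ride [shared_start, shared_end) onto intervals of length tau.
    Returns {interval_index: weight}, with 1-based interval indices."""
    weights = {}
    t = int(shared_start // tau)  # zero-based index of the first interval touched
    while t * tau < shared_end:
        lo = max(shared_start, t * tau)
        hi = min(shared_end, (t + 1) * tau)
        if hi > lo:
            weights[t + 1] = (hi - lo) / tau
        t += 1
    return weights


# Two passengers share a vehicle from minute 50 to minute 75; intervals last 60 min.
print(pen_weights(50, 75, 60))
```

In this example, the 25 shared minutes are split into 10 minutes in the first interval and 15 minutes in the second, giving weights of 1/6 and 1/4, respectively.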

Social-activity contact network
There are various other situations, including public transportation, wherein passengers can come in contact with each other. These situations include social activities (SA) where passengers are heterogeneously distributed in terms of space. However, to capture the actual connectivity of passengers during daily activities, rich datasets (e.g., mobile data, GPS data) are required. In this study, we made the following assumptions to build the SA contact network.


Global interaction: Passengers can interact with any other individual in the system with a uniform probability $\theta_g$ during time interval $t$.
Local interaction: Passengers who share the same origin or destination of a PT trip can interact with each other with a uniform probability $\theta_l$ during time interval $t$.
$\theta_l > \theta_g$ holds because local interaction is stronger than global interaction. In global interaction, it is assumed that the contact time of all connected individuals is $\tau$ for a specific time interval when there is no PT or local contact. Otherwise, the contact time is reduced by the PT and local contact duration (CD) in that time interval. In local interaction, the contact time is calculated as follows. Consider a passenger $i$ whose PT trip sequence is $\{(o_1, d_2), (o_3, d_4)\}$, where $o$ and $d$ denote the trip origin and destination, respectively, and $t_k$ denotes the time at which the passenger boards or alights from the vehicle at stop $k$. A trip sequence is defined as a sequence of consecutive PT trips in which all adjacent trip pairs are separated by less than 24 h (e.g., $t_3 - t_2 < 24$ h). Additionally, $d_2 \ne o_3$ may hold because the passenger may not stay in the same place between two consecutive trips. It is also assumed that from $t_2$ to $t_3$ the passenger spends half of the activity time at $d_2$ and half of the activity time at $o_3$. Let the trip sequence of passenger $j$ be $\{(o'_1, d'_2), (o'_3, d'_4)\}$, with $d'_2 = o_3$, and suppose that the intervals $[t_2, t_3]$ and $[t'_2, t'_3]$ overlap. Passengers $i$ and $j$ then stay at the same location $d'_2 = o_3$ and may have a local contact. Because each passenger is assumed to spend half of the time of an activity at a specific origin or destination, if they have a local contact, the CD between passengers $i$ and $j$ is half of the overlap between the intervals $[t_2, t_3]$ and $[t'_2, t'_3]$. This calculation provides the total CD of $i$ and $j$ at the local interaction level.
For example, if $t_2 < t'_2 < t_3 < t'_3$, then the total local CD between $i$ and $j$ is $\frac{1}{2}(t_3 - t'_2)$. As for the PEN, the total local CD can be mapped to each time interval. For example, let $t^*$ be the boundary between time intervals $t$ and $t + 1$, and suppose that $t^* - \tau < t'_2 < t^* < t_3 < t^* + \tau$. We denote the local CD between $i$ and $j$ for time interval $t$ as $\tilde{d}_{i,j,t}$ ($0 \le \tilde{d}_{i,j,t} \le \tau$). Then, $\tilde{d}_{i,j,t} = t^* - t'_2$ and $\tilde{d}_{i,j,t+1} = t_3 - t^*$.
The SA contact network is $\tilde{G}_t(\mathcal{N}, \tilde{\mathcal{E}}_{g,t}, \tilde{\mathcal{E}}_{l,t}, \tilde{\mathcal{W}}_{g,t}, \tilde{\mathcal{W}}_{l,t})$, where $\tilde{\mathcal{E}}_{g,t}$ denotes the edge set of global interaction and $\tilde{\mathcal{E}}_{l,t}$ denotes the edge set of local interaction. The edge of global interaction between $i$ and $j$, denoted by $\tilde{e}^{g}_{i,j,t}$, exists with probability $\theta_g$ for all $i, j \in \mathcal{N}$. When $i$ and $j$ share the same PT trip origin or destination during time interval $t$, the edge of local interaction between $i$ and $j$ ($\tilde{e}^{l}_{i,j,t}$) exists with probability $\theta_l$. Furthermore, $\tilde{\mathcal{W}}_{g,t}$ and $\tilde{\mathcal{W}}_{l,t}$ denote the weight sets on the global and local interaction edges, respectively.
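The local contact duration rule can be written as a one-line calculation. The sketch assumes, as described in the text, that two passengers at the same location are in local contact for half of the overlap between their activity intervals; the function name and the minute-based times are illustrative.

```python
def local_contact_duration(a_start, a_end, b_start, b_end):
    """Half of the overlap between two activity intervals, or 0 if they do not overlap."""
    overlap = min(a_end, b_end) - max(a_start, b_start)
    return 0.5 * max(overlap, 0.0)


# Passenger i is active from minute 600 to 720, passenger j from minute 660 to 800.
print(local_contact_duration(600, 720, 660, 800))
```

In this example the two activity intervals overlap for 60 minutes, so the local contact duration is 30 minutes; non-overlapping intervals yield zero.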
We developed an encounter network on the public transportation network based on the OD pairs in the smart card data for which traffic assignment had been completed. Furthermore, we built the actual public transportation network from the operational routes and timetables by time zone to represent public transportation users accurately. During the construction of the encounter network, we analyzed users, including existing occupants, boarding passengers, and future occupants, who met for a minimum contact time of 10 min. Subsequently, a familiar stranger group was created based on the individual users and the estimated encounter network. Hence, the standard of encounter is as follows.
Fig. S8. Encounter Network analysis.
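The encounter criterion described above can be sketched as a pairwise overlap test. The ride tuples, passenger IDs, and vehicle IDs below are hypothetical; only the 10-min minimum contact time comes from the text, and the sketch ignores compartments and timetables for brevity.

```python
from collections import defaultdict
from itertools import combinations

MIN_CONTACT_MIN = 10  # minimum contact time from the text


def find_encounters(rides, min_contact=MIN_CONTACT_MIN):
    """rides: iterable of (passenger_id, vehicle_id, board_min, alight_min).
    Returns {(passenger_a, passenger_b): shared minutes} for qualifying pairs."""
    by_vehicle = defaultdict(list)
    for passenger, vehicle, board, alight in rides:
        by_vehicle[vehicle].append((passenger, board, alight))
    encounters = {}
    for riders in by_vehicle.values():
        for (p, b1, a1), (q, b2, a2) in combinations(riders, 2):
            shared = min(a1, a2) - max(b1, b2)  # minutes both are on board
            if p != q and shared >= min_contact:
                key = tuple(sorted((p, q)))
                encounters[key] = encounters.get(key, 0) + shared
    return encounters


rides = [
    ("p1", "train_A", 0, 30),
    ("p2", "train_A", 15, 40),
    ("p3", "train_A", 25, 50),
    ("p4", "train_B", 0, 30),
]
print(find_encounters(rides))
```

Here `p1` and `p2` share 15 minutes and `p2` and `p3` share 15 minutes, whereas `p1` and `p3` overlap for only 5 minutes and are therefore not counted as an encounter.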

Infectious Pathways
In South Korea, the Centers for Disease Control and Prevention (CDCP) provides the actual paths of infected individuals obtained through epidemiological investigations. These data are provided in the form of OD pairs by time zone, similar to smart card data, covering the period from 00 days before the infected person receives confirmation of infection until the time when they receive confirmation, return to hospital, or enter quarantine. In this study, we used the data on real infected individuals available from the epidemiological survey data. As in the methodology above, we assigned the OD pairs for the movement of each infected person through traffic assignment based on shortest-path route selection. Furthermore, the CDCP released data on epidemiological investigations of individuals who received COVID-19 confirmation on February 25, 2020. Using the smart card data and the data on real infected individuals, we determined that one individual used public transportation from Yeouinaru Station to Songjeong Station on February 22, 2020. We then estimated and applied the intermediate stages of this movement (e.g., the time zone for each section), as shown in Table S6.

Applying an SEIR structure
We performed infection diffusion pathway analysis based on the analyzed individual encounter network. In a previous study, an SEIR structure was applied as the infection diffusion path. The SEIR model generally represents the degree of spread of respiratory infections and divides the population into four stages. In this study, an SEIR structure was applied according to the contact duration (CD) calculated from the PEN in the public transportation network. We first classified the population into four classes according to the stage of the disease (28, 29, 30):
• Susceptible (denoted by $S$; those who can contract the infection)
• Exposed ($E$; those who are infected with the disease but are unable to spread the virus or can spread it only with low probability)
• Infectious ($I$; those who are infected and contagious)
• Recovered ($R$; those who are removed from the spread, either because they have recovered from the disease and acquired immunity or because they have died).
By definition, $\mathcal{N} = S \cup E \cup I \cup R$, where $\mathcal{N}$ is the set of the entire population. The SEIR model describes how an individual passes through each compartment of the model. The infection rate, $\beta$, controls the rate of spread and is related to the probability of disease transmission between a susceptible (S) and an exposed individual (E). The incubation rate, $\sigma$, is the rate at which an exposed individual (E) becomes infectious (I). The removal rate, $\gamma$, is the sum of the recovery and mortality rates due to COVID-19. In the SEIR model, it is generally assumed that, given the immunity obtained, a recovered individual will not be infected again. This study focuses on the early stages of the epidemic process, in which the effect of external factors on $\mathcal{N}$ (e.g., births and natural deaths) is not considered. For epidemic process models, the quantities of interest are typically the steady state, the epidemic threshold, and the reproduction number. According to Pastor-Satorras and colleagues (31), the number of infected individuals in the SEIR model tends to zero after prolonged periods (see Fig. S11) (31,32). This follows from the SEIR structure, in which R is the only absorbing state (i.e., there are no transitions out of it). The basic reproduction number, denoted by $R_0$, is defined as the average number of secondary infections caused by a primary case introduced into a fully susceptible population (28). In the standard SEIR model, $R_0 = \beta/\gamma$. In many cases, the epidemic threshold is defined based on the value of $R_0$. When $R_0 < 1$, the number of infectious individuals tends to decline exponentially, which indicates no epidemic. However, if $R_0 > 1$, the number of infectious individuals may increase exponentially, resulting in an epidemic, as shown in Fig. S9. In this study, the spread is analyzed using the SEIR model, which is widely used for modeling infectious diseases.
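The threshold behavior of the reproduction number can be illustrated with a minimal population-level SEIR simulation. This is a generic textbook sketch with forward-Euler time stepping, not the individual-based model used in this study, and all parameter values are illustrative rather than the calibrated values in Table S7.

```python
def simulate_seir(beta, sigma, gamma, days, n=1_000_000.0, e0=0.0, i0=10.0, dt=0.1):
    """Forward-Euler integration of the standard SEIR equations.
    Returns a list of (S, E, I, R) tuples, one per day including day 0."""
    s, e, i, r = n - e0 - i0, e0, i0, 0.0
    history = [(s, e, i, r)]
    steps_per_day = round(1.0 / dt)
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / n * dt   # S -> E
            new_infectious = sigma * e * dt       # E -> I
            new_removed = gamma * i * dt          # I -> R
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_removed
            r += new_removed
        history.append((s, e, i, r))
    return history


# Illustrative parameters only: reproduction number 0.5 versus 3.0.
declining = simulate_seir(beta=0.1, sigma=0.25, gamma=0.2, days=30)
growing = simulate_seir(beta=0.6, sigma=0.25, gamma=0.2, days=30)
print(declining[-1][2], growing[-1][2])  # infectious individuals after 30 days
```

With the reproduction number below 1, the number of infectious individuals shrinks from its initial value, whereas with a value above 1 it grows by orders of magnitude over the same period.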
The movement between the four groups is governed by parameters such as the transmission rate, recovery rate, average infection period, and incubation period of the infectious disease. The relevant parameters were calculated for the domestic situation based on statistical data from the Korea Centers for Disease Control and Prevention. Furthermore, January 21, when the first case was confirmed, was set as the start date of the analysis, as shown in Table S7. Typical epidemic models can be divided into two categories: individual-based and degree-based approaches. The individual-based approach models epidemic transmission at the individual level, whereas the degree-based approach captures the infection process at the group level, where each group contains a set of nodes (individuals) of the same degree. In the PEN, we can characterize human interaction at the individual level. Therefore, the individual-based framework was chosen in this study. The Seoul Metropolitan Subway is the world's fourth-largest subway system, after those of Beijing, Shanghai, and Guangzhou in China. The length of a single compartment is 20 m; thus, the total length of a train is usually 160 m to 200 m, with 8 to 10 compartments, respectively. In a subway of this size, passengers can avoid contact with each other even if they board the same train at the same time. Therefore, we defined the set of compartments with individual passengers on board to build the individual-based framework. Because it was difficult to confirm which compartment each individual agent was in, we assumed that the likelihood that a given compartment contained the contacted passenger was inversely proportional to the number of compartments on the subway line.
In an actual public transportation setting, users tag their smart cards at an entrance and then walk on to board a train. We used artificial intelligence techniques, namely a many-to-many deep neural network, to track the compartments boarded by agents according to the following procedure. First, according to the constructed trip chain, we tracked each agent's path from outside the subway to the entrance where the smart card was tagged. This was possible because we had records of each agent's first origin, destination, and public transportation usage. Second, we identified the locations in the subway station where smart cards were tagged and the locations of the nearest stairs and elevators, and thus predicted which compartment was selected after the smart card was tagged. Third, we validated the predictions against the actual number of passengers in each compartment, derived from the compartment weights of the subway in Busan, using the root-mean-square error (RMSE). In the Busan Metro system, a load-compensating sensor is installed in each compartment, and the data from this sensor enabled us to determine the number of passengers per compartment, as shown in Fig. S10.
Fig. S10. Tracking the compartments containing boarded agents using artificial intelligence techniques.
However, the parameters obtained from the Busan Metro cannot be generalized directly to the Seoul Metropolitan Subway, which is the focus of this paper. Nevertheless, we expect the predictions based on real data to be applicable to our current study.
We describe whether individual $i$ is in class S, I, E, or R at time interval $t$ with the Bernoulli random variables $S_{i,t}$, $I_{i,t}$, $E_{i,t}$, and $R_{i,t}$. By definition, $S_{i,t} + I_{i,t} + E_{i,t} + R_{i,t} = 1$ for all $i$ and $t$. Let $\mathbb{P}(X_{i,t} = 1) = p^{X}_{i,t}$, where $X \in \{S, I, E, R\}$ and $\sum_{X} p^{X}_{i,t} = 1$. Given that the contact network is defined in discrete time, the epidemic process of the SEIR model can be described as a discrete Markov process with certain transition probabilities.
To match the epidemiological properties of COVID-19, it is assumed, based on recent findings (33), that exposed individuals can infect others, which may not be common in SEIR models. Let $\beta_I$ denote the probability that an individual $i \in S$ is infected by an infectious person $j \in I$ at time interval $t$ if $i$ and $j$ are in contact with each other during the entire time interval (by PT or SA). Given that the actual transmission probability is related to the duration of the interaction, the actual probability of $i$ being infected by $j$, $\beta^{I}_{i,j,t}$, is as follows:
$$\beta^{I}_{i,j,t} = e_{i,j,t}\, h(w_{i,j,t}, \beta_I) \cdot \mathrm{Ratio} + \tilde{e}^{g}_{i,j,t}\, h(\tilde{w}^{g}_{i,j,t}, \beta_I) + \tilde{e}^{l}_{i,j,t}\, h(\tilde{w}^{l}_{i,j,t}, \beta_I) \quad \forall i \in \mathcal{N},\ j \in \mathcal{N},$$
where $h(\cdot, \cdot)$ denotes a function that describes the actual transmission probability for a given CD, and $w_{i,j,t}$, $\tilde{w}^{g}_{i,j,t}$, and $\tilde{w}^{l}_{i,j,t}$ are the weights of the PT, global interaction, and local interaction edges, respectively. The function $h$ can take the form of a survival function (e.g., exponential or Weibull) or a linear function (i.e., $h(w, \beta) = \beta w$, which is used in the case studies). $e_{i,j,t}$ ($\tilde{e}^{g}_{i,j,t}$, $\tilde{e}^{l}_{i,j,t}$) is a variable that indicates whether the corresponding edge exists. $e_{i,j,t}$ is a known constant, whereas $\tilde{e}^{g}_{i,j,t}$ and $\tilde{e}^{l}_{i,j,t}$ are random variables that follow Bernoulli distributions: $\tilde{e}^{g}_{i,j,t} \sim \mathcal{B}(\theta_g)$ and $\tilde{e}^{l}_{i,j,t} \sim \mathcal{B}(\theta_l \delta_{i,j,t})$, where $\delta_{i,j,t} = 1$ if $i$ and $j$ share the same origin or destination at time interval $t$ and $\delta_{i,j,t} = 0$ otherwise. Furthermore, the function Ratio reflects the average distance to individual $i \in S$ through the linear regression and is calculated based on the level of public transport congestion. The reduction in infection probability with average distance was determined as conservatively as possible; that is, we considered the effect of distance based on the aerosol diffusion distance when no mask was worn. In the no-mask experiment (Table S1), the average distance between agents was taken to be 0.51 m under the highest level of congestion (170%). We then applied Ratio to the number of particles, which decreased as the average distance between individual agents increased.
Similarly, we define $\beta_E$ as the probability that $i \in S$ is infected by an exposed individual $j \in E$ at time interval $t$ if $i$ and $j$ are in contact with each other for the entire time interval ($\beta_E \ll \beta_I$). The actual transmission probability considering the duration of the interaction is as follows:
$$\beta^{E}_{i,j,t} = e_{i,j,t}\, h(w_{i,j,t}, \beta_E) \cdot \mathrm{Ratio} + \tilde{e}^{g}_{i,j,t}\, h(\tilde{w}^{g}_{i,j,t}, \beta_E) + \tilde{e}^{l}_{i,j,t}\, h(\tilde{w}^{l}_{i,j,t}, \beta_E) \quad \forall i \in \mathcal{N},\ j \in \mathcal{N}.$$
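The structure of the pairwise transmission probability can be sketched in a few lines. The sketch assumes the linear form of h (base probability times edge weight) and a sum of three contributions: a public transport term scaled by the distance ratio, a global interaction term, and a local interaction term. The function names, the Bernoulli sampling helper, and all numerical values are illustrative assumptions.

```python
import random


def h_linear(weight, beta):
    """Linear transmission function: base probability scaled by the edge weight."""
    return beta * weight


def sample_edge(probability, rng):
    """Bernoulli draw deciding whether a global or local interaction edge exists."""
    return rng.random() < probability


def transmission_probability(beta, ratio, w_pt, w_global, w_local,
                             e_pt, e_global, e_local):
    """Sum of PT (distance-adjusted), global, and local contributions."""
    return (e_pt * h_linear(w_pt, beta) * ratio
            + e_global * h_linear(w_global, beta)
            + e_local * h_linear(w_local, beta))


# Hypothetical values: half an interval spent together on PT, plus a local contact.
p = transmission_probability(beta=0.02, ratio=0.8, w_pt=0.5, w_global=0.0,
                             w_local=0.25, e_pt=1, e_global=0, e_local=1)
print(round(p, 4))
```

In a simulation, `e_global` and `e_local` would be drawn with `sample_edge` for each pair and interval, while `e_pt` and `w_pt` come directly from the encounter network.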
We assume that if $i$ and $j$ are in contact, the probability of transmission depends only on the CD. The average distance between individual agents was used to calculate the change in transmission probability due to changes in the spatial distribution. This average distance was calculated from the level of congestion in public transport, which was determined from the number of passengers on board and the actual area of a public transportation vehicle operating in Seoul. Specifically, $\sigma$ denotes the probability of E → I, which is not network related, and $\gamma$ denotes the probability of I → R, with $\gamma = \gamma_r + \gamma_d$. Furthermore, $\gamma_r$ and $\gamma_d$ are the probabilities that infected individuals will be treated and die, respectively, and neither probability is related to the network. Thus, using this notation and epidemic transmission mechanism, the system equations can be written as follows:
$$p^{S}_{i,t+1} = p^{S}_{i,t} - \sum_{j \in \mathcal{N}} \left[ \beta^{I}_{i,j,t}\, \mathbb{P}(S_{i,t} = 1, I_{j,t} = 1) + \beta^{E}_{i,j,t}\, \mathbb{P}(S_{i,t} = 1, E_{j,t} = 1) \right] \quad (10)$$
$$p^{E}_{i,t+1} = p^{E}_{i,t} - \sigma p^{E}_{i,t} + \sum_{j \in \mathcal{N}} \left[ \beta^{I}_{i,j,t}\, \mathbb{P}(S_{i,t} = 1, I_{j,t} = 1) + \beta^{E}_{i,j,t}\, \mathbb{P}(S_{i,t} = 1, E_{j,t} = 1) \right] \quad (11)$$
$$p^{I}_{i,t+1} = p^{I}_{i,t} - p^{I}_{i,t}(\gamma_r + \gamma_d) + \sigma p^{E}_{i,t} \quad (12)$$
$$p^{R}_{i,t+1} = p^{R}_{i,t} + p^{I}_{i,t}(\gamma_r + \gamma_d) \quad (13)$$
The calculation of $\mathbb{P}(S_{i,t} = 1, I_{j,t} = 1)$ requires the joint distribution of $S_{i,t}$ and $I_{j,t}$, which is not typically available. According to the individual-based mean-field approximation, it can be assumed that the states of neighbors are independent (34,35,36):
$$\mathbb{P}(S_{i,t} = 1, I_{j,t} = 1) = p^{S}_{i,t}\, p^{I}_{j,t} \quad (14)$$
$$\mathbb{P}(S_{i,t} = 1, E_{j,t} = 1) = p^{S}_{i,t}\, p^{E}_{j,t} \quad (15)$$
Therefore, we substituted Eqs. 14 and 15 into Eqs. 10 and 11 to obtain a new group of system equations that can be solved. This study analyzed the encounter network by time zone based on smart card data. In public transportation, weekdays, weekends, and peak hours have distinct characteristics. Hence, we analyzed the encounter network in terms of degree and contact duration, classified by weekday, weekend, and time zone. We assumed an interval of 1 h. A quadratic polynomial was selected for the trend line in the graph. The encounter network is determined by the traffic volume for each section by time zone and the OD pairs of individual trips, as shown in Fig. S11 and Fig. S12.
Fig. S11. Encounter Network Tracking of a five-passenger system.
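One update step of an individual-based mean-field SEIR process can be sketched for a five-passenger system. The sketch assumes that neighbor states are treated as independent, that the infection pressure on each individual is the sum of pairwise transmission probabilities weighted by the neighbors' state probabilities, and that this pressure is capped at 1 to keep probabilities valid; the cap and all parameter values are our illustrative assumptions.

```python
def mean_field_step(p, beta_i, beta_e, sigma, gamma):
    """One discrete-time update of per-individual state probabilities.
    p: {"S": [...], "E": [...], "I": [...], "R": [...]}
    beta_i[i][j], beta_e[i][j]: probability that j infects i in this interval
    when j is infectious or exposed, respectively."""
    n = len(p["S"])
    nxt = {state: [0.0] * n for state in "SEIR"}
    for i in range(n):
        pressure = sum(beta_i[i][j] * p["I"][j] + beta_e[i][j] * p["E"][j]
                       for j in range(n) if j != i)
        new_exposed = p["S"][i] * min(pressure, 1.0)  # cap is an added safeguard
        nxt["S"][i] = p["S"][i] - new_exposed
        nxt["E"][i] = p["E"][i] - sigma * p["E"][i] + new_exposed
        nxt["I"][i] = p["I"][i] - gamma * p["I"][i] + sigma * p["E"][i]
        nxt["R"][i] = p["R"][i] + gamma * p["I"][i]
    return nxt


# Five passengers; passenger 0 starts infectious. All rates are hypothetical.
n = 5
state = {"S": [0.0] + [1.0] * 4, "E": [0.0] * n, "I": [1.0] + [0.0] * 4, "R": [0.0] * n}
beta_i = [[0.0 if i == j else 0.05 for j in range(n)] for i in range(n)]
beta_e = [[0.0 if i == j else 0.005 for j in range(n)] for i in range(n)]
after_one = mean_field_step(state, beta_i, beta_e, sigma=0.2, gamma=0.1)
print(after_one["S"][1], after_one["E"][1], after_one["I"][0])
```

Iterating this step over the time-varying encounter network, with the pairwise probabilities recomputed in each interval, gives the probability of each individual being in each state over time.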

Mobility patterns in Seoul Metropolitan subway
Seoul is one of the most public transportation-friendly cities in the world. With the introduction of various demand management policies, more than 70% of people in Seoul use public transport (37). We validated the impact of the encounter network by including cases of real infected people who used public transportation. The environment in which each infected person used public transport was assigned using the activity-based model and the actual movement records of real infected people described in section 2. For this validation, we analyzed the high-demand routes in the encounter network, which varied over time. We tracked each infected person from the moment they boarded the compartment through the first to the fourth encounters. First, Line 5 was analyzed at 11:00 a.m. on February 22, 2020 (weekend), when the number of actual infected people was low, and the results are shown in the left-hand panels of Fig. S13. Second, Line 7 was analyzed at 9:00 a.m. (peak) on February 28, 2020 (weekday), when actual infected people used the line, and the results are shown in the middle panels of the figure. Third, Line 2 was analyzed at 9:00 p.m. on February 26, 2020 (weekday), when many infected people used the line, and the results are shown in the right-hand panels. Line 2 had the highest demand because it passes through all of the core areas of Seoul, and the highest number of encounters occurred on this line even though the analysis covered a non-peak hour, as shown in Fig. S13, Fig. S14, and Movies S1 to S3.

Model parameters and adjustment factors in this study
South Korea is currently responding to COVID-19 with many measures, including the introduction of a work system with social distancing. Due to these measures, the actual use of public transportation has decreased sharply. In this study, although smart card data were used, the congestion was recalculated by reducing the public transport traffic volume in each time zone according to the actual traffic reduction ratio. Because the degree of exposure in this study is closely related to the congestion in public transportation, the realistic infection risk under the current response is calculated from the recalculated congestion, as shown in Table S9.
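The recalculation can be illustrated with a short sketch. Congestion is assumed here to be the ratio of passengers to capacity, and the passenger counts, capacity, and reduction ratios are hypothetical values chosen for the arithmetic; they are not the values of Table S9.

```python
def adjusted_congestion(passengers, capacity, reduction_ratio):
    """Congestion (passengers / capacity) after scaling demand down.

    reduction_ratio is the fraction of traffic lost, e.g. 0.25 for a 25% drop.
    The passengers / capacity definition is an assumption of this sketch.
    """
    if not 0.0 <= reduction_ratio <= 1.0:
        raise ValueError("reduction_ratio must be between 0 and 1")
    return passengers * (1.0 - reduction_ratio) / capacity

# Hypothetical hourly passenger counts and reduction ratios by time zone.
hourly_passengers = {8: 1600, 14: 800}
hourly_reduction = {8: 0.25, 14: 0.5}
capacity = 1000

recalculated = {
    hour: adjusted_congestion(count, capacity, hourly_reduction[hour])
    for hour, count in hourly_passengers.items()
}
print(recalculated)
```

With these illustrative inputs, the 8:00 time zone drops from a congestion of 1.6 to 1.2 and the 14:00 time zone from 0.8 to 0.4.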
4.2 "Uncontrolled" and "Controlled" situations in South Korea and Policy proposal
In this study, we used the encounter network analysis to verify the SEIR model for the case in which actual infected individuals use public transportation without wearing a mask and without maintaining social distance from others. We calculated the hazardous area using the model. However, in the current COVID-19 response, countermeasures such as wearing a mask and maintaining social distance are in place, and these policies have significantly limited the spread of the disease. Specifically, wearing a mask and maintaining social distance are considered to significantly reduce infection in public transportation. We therefore performed an additional case study in which individuals were assumed to wear a mask and maintain social distancing in the current situation. Panels A and B of Fig. S15 show the proportion of exposed and infected agents, respectively. The plots are obtained by applying the data to the public transportation network by date based on the SEIR model. As shown in the graph, the numbers of exposed and infected agents are reduced when social distancing and mask wearing are implemented.
Fig. S15. Exposed and Infected trend based on the date of each case.
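For readers unfamiliar with SEIR dynamics, the sketch below shows a generic, textbook compartmental SEIR update with a daily time step. It is not the agent-based, encounter-network version used in this study. The rates `beta`, `sigma`, and `gamma`, the population size, and the `contact_scale` multiplier standing in for the combined effect of masks and distancing are all hypothetical.

```python
def seir_step(s, e, i, r, beta, sigma, gamma, contact_scale=1.0):
    """One daily update of a generic SEIR model (forward Euler).

    contact_scale < 1 mimics countermeasures that lower effective transmission;
    it is an assumption of this sketch, not a parameter of the study.
    """
    n = s + e + i + r
    new_exposed = contact_scale * beta * s * i / n
    new_infectious = sigma * e
    new_recovered = gamma * i
    return (
        s - new_exposed,
        e + new_exposed - new_infectious,
        i + new_infectious - new_recovered,
        r + new_recovered,
    )

def simulate(days, state, **params):
    """Return the list of (S, E, I, R) states for day 0 .. days."""
    history = [state]
    for _ in range(days):
        state = seir_step(*state, **params)
        history.append(state)
    return history

# Hypothetical population and rates, for illustration only.
initial = (9990.0, 0.0, 10.0, 0.0)
uncontrolled = simulate(30, initial, beta=0.5, sigma=0.2, gamma=0.1)
controlled = simulate(30, initial, beta=0.5, sigma=0.2, gamma=0.1, contact_scale=0.1)

print("susceptible after 30 days, uncontrolled:", round(uncontrolled[-1][0], 1))
print("susceptible after 30 days, controlled:  ", round(controlled[-1][0], 1))
```

Lowering `contact_scale` leaves more agents susceptible after 30 days, which is the qualitative behaviour described for the controlled situation.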
First, we use the public transportation smart card data to build a trip chain for each passenger and predict individual routes. Then, through the encounter network analysis, we identify the passengers who were exposed. By considering mask wearing and the congestion level, we apply the SEIR model (especially the exposed state) to analyze the degree of exposure to the virus. Thus, it is possible to reduce exposure through two methods. Based on this, a social distancing policy can be proposed that provides detour routes on the user's side, adjusts the operation schedule on the supplier's side, sets the central quarantine section at the government level, and controls the congestion of public transportation, as shown in Fig. S16.

Table S11. Number of exposed agents in public transportation over time.

In case 1, neither social distancing nor mask wearing is implemented. This causes a sharp daily increase in the number of infected cases from the initial generation onward. In case 2, the absolute number of agents decreased, but masks were not worn. The probability of exposure therefore remained high, so a certain number of agents were still exposed, and the decrease after 30 days was 39.8% compared with case 1. In case 3, social distancing was not performed. However, the probability of exposure was significantly reduced by wearing a mask, resulting in a 95.8% decrease after 30 days compared with case 1. In case 4, when both policies were implemented, the number of exposed agents did not change significantly over time and differed significantly from that in case 1, showing a 96.6% decrease after 30 days, as shown in Table S12.
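The percentage decreases quoted for cases 2 to 4 are relative to case 1 at the same time point. The arithmetic can be sketched as follows; the agent counts are hypothetical round numbers and are not the values of Table S11 or Table S12.

```python
def percent_decrease(baseline, value):
    """Percentage reduction of value relative to baseline (case 1)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - value) / baseline

# Hypothetical exposed-agent counts after 30 days, for illustration only.
case1_exposed = 200
other_case_exposed = 50

print(percent_decrease(case1_exposed, other_case_exposed))  # 75.0
```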