Host cell protein profiling in biopharmaceutical harvests

Publication Date (Web): July 26, 2018

Article

Host cell protein profiling in biopharmaceutical harvests Darja Obrstar, Frieder Kroener, Bostjan Japelj, Lea Bojic, and Oliver Anderka Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.analchem.8b01236 • Publication Date (Web): 26 Jul 2018 Downloaded from http://pubs.acs.org on July 27, 2018



Analytical Chemistry

Host cell protein profiling in biopharmaceutical harvests

Darja Obrstar1*a, Frieder Kröner2*a, Bostjan Japelj1, Lea Bojic1, and Oliver Anderka2

1 Novartis Pharma AG, Kolodvorska 27, SI-1234 Menges, Slovenia
2 Novartis Pharma AG, Klybeckstrasse 141, 4057 Basel, Switzerland
* Both authors contributed equally.
a Corresponding authors:

Frieder Kroener, PhD
Novartis Pharma AG
Klybeckstrasse 141, 4057 Basel, Switzerland
Phone: +41 61 69 69980
Mail: [email protected]

Darja Obrstar, PhD
Novartis Pharma AG
Kolodvorska 27, SI-1234 Mengeš, Slovenia
Phone: +386 1 7297778
Mail: [email protected]


Abstract

Biopharmaceuticals contain residual host cell protein (HCP) impurities, a complex mixture of endogenous proteins from production cell lines such as Chinese hamster ovary (CHO) cells. The composition of HCP impurities at harvest depends on multiple factors, e.g., identity of the cell line, cell density and viability at harvest, and other process parameters. Two-dimensional differential gel electrophoresis (2-D DIGE) was used to compare HCP in 15 null cell culture harvest supernatants from five different CHO cell lines, which are representative of a wide range of manufacturing processes for therapeutic antibodies. Numerical metrics were developed to quantitatively compare HCP composition; these may be used to assess the suitability of a platform HCP assay standard for a new product or to assess the impact of process changes. Very similar HCP compositions were found for the 15 analyzed CHO null cell culture harvests, demonstrating that even the wide range of applied manufacturing processes did not have a strong influence on the HCP impurities.

Keywords

2-D DIGE, HCP assay, process-specific, platform, host cell proteins, impurities, similarity

Abbreviations

2-D DIGE, 2-D difference gel electrophoresis; HCP, host cell proteins; CHO, Chinese hamster ovary; ELISA, enzyme-linked immunosorbent assay; BCA, bicinchoninic acid assay; PCA, principal component analysis; PC, principal component; FPR, fold pass rate; TR, technical replicate; BR, biological replicate; ER, experimental replicate.


Therapeutic proteins are typically produced by cultivation of host cells that carry the gene for the protein of interest. CHO cells are most commonly used for this purpose. Here, recombinant proteins are secreted into the cultivation medium and purified from the cell harvest supernatant1. The most abundant host cell-derived impurities are endogenous host cell proteins (HCP). Residual HCP may directly affect quality, safety and efficacy of the biopharmaceutical product2. According to regulatory requirements, a manufacturing process must remove HCP to consistently low levels, and HCP removal through the process must be monitored by a sensitive analytical method3. Residual HCP is typically measured by immunoassay4. A single platform assay may be used within a company to test a variety of products derived from the same type of cell line (e.g., CHO) and cultivated under similar conditions5. For each new product, it needs to be demonstrated that a platform assay is suitable to detect and quantify HCP impurities. Where process changes or a different CHO cell line are introduced, it is critical to evaluate whether these variations render the platform assay unsuitable5. The suitability assessment poses an analytical challenge, since HCP are a complex impurity mixture whose composition can be influenced by many process parameters. Recent studies investigated the influence of upstream process parameters on CHO HCP impurity compositions, using null cells (i.e., cells lacking the gene of interest) or cells producing a therapeutic protein6-10. All studies showed a potential impact of culture viability on HCP compositions. Yuk et al. investigated the influence of different CHO cell lines and process parameters on HCP in cell culture supernatants. Their findings indicate that HCP compositions do not strongly depend on cell line, upstream processes or cell viability. 
However, in these studies selected host cell protein preparations were analyzed to draw direct conclusions on the influence of individual isolated parameters, e.g., cultivation properties, on HCP composition.


The present study investigated HCP compositions in null cell culture supernatants from a wide range of upstream processes and CHO cell lines, representing actual manufacturing processes for therapeutic monoclonal antibodies. The study was performed with the aim to qualitatively and quantitatively assess differences in the HCP compositions and to reduce the complex differences to a simple numerical metric. Such a metric could be applied, e.g., as a pre-defined requirement to qualify a platform HCP assay standard for a new product or for a changed manufacturing process. To compare complex protein mixtures (e.g., cell culture supernatants), the most commonly used techniques are 2-D gel electrophoresis and LC-MS based approaches. These differ in sample preparation, separation, detection, instrumentation, etc. and the results obtained are considered complementary11-16. Here, 2-D DIGE was used due to its simplicity, robustness, multiplexing capability, inclusion of internal standards, and accuracy in protein quantification7,17-19. To analyze the 2-D DIGE data we applied two mathematical approaches in order to evaluate a suitable numerical metric for “HCP similarity”. Both gave consistent results and overall demonstrated a high degree of similarity of HCP compositions for samples representing a wide range of CHO cell lines and cultivation processes.

Experimental section

Cell lines and cultivation processes

Six different CHO cell lines (CHOa, -b, -c, -d, -e, -f) were used. The parental cell lines were mock transfected with an expression vector lacking the gene of interest. In contrast to the production strains (used for the recombinant production of a therapeutic protein), no clone selection of the transfected pool was performed. Cell lines were cultivated using the product-specific upstream processes (process 1–15), which were optimized for the respective production strain (see Table 1). These processes differ significantly in the applied cultivation conditions, e.g., feeding strategy, media and duration (harvest between day 9 and day 17). CHOa_0, the parental cell line of CHOa, was cultivated according to an early version of the Novartis upstream platform process (process 0). To examine variation between biological replicates, triplicate runs were performed for processes CHOa_8 and CHOe_1 (i.e., biological replicates CHOa_8_BR1–3 and CHOe_1_BR1–3). CHOf_1, included as a negative control in the similarity analysis, was produced using a distinctly different process applying repeated batch fermentation; all other cultivations were carried out in fed-batch mode.

Host cell protein preparations

Cell cultures for the host cell protein preparations were performed at scales ranging from 5 to 300 L. At harvest, cells and cell debris were removed from the supernatant by centrifugation and/or depth filtration. Cleared supernatants were diafiltered against PBS, concentrated to a total protein concentration of 1–10 mg/mL, sterile filtered and stored at -80 °C. For the negative control, CHOf_1, the harvest was additionally processed using ion exchange chromatography (bind and elute) before diafiltration, similar to the first purification step in the product-specific downstream process (reducing the quantity of HCP by approximately 70%). Total protein concentrations of host cell protein preparations were determined using a Micro-BCA assay (Thermo Fisher Scientific, Waltham, MA, USA).

2-D DIGE analysis of HCP samples

Comparative 2-D DIGE analysis was performed in two separate experiments using the same internal standard. The first experiment included 13 different host cell protein preparations (CHOa_0, CHOa_1–7, CHOb_1, CHOc_1–2, and CHOd_1–2). The second experiment included six preparations from three replicate runs each (CHOa_8_BR1–3; CHOe_1_BR1–3) of two cultivation processes, CHOa_8 and CHOe_1, and the negative control CHOf_1. As an experimental variation control, the parental cell line preparation (CHOa_0) was analyzed in both experiments (i.e., experimental replicates CHOa_0_ER1–2). All preparations were analyzed in three technical replicates.

2-D DIGE was performed according to the manufacturer's instructions, using the manufacturer's equipment and principal consumables (GE Healthcare, UK). Samples were prepared using the 2-D Quant Kit and the 2-D Clean-Up Kit, and labeled using the CyDye DIGE Fluor Minimal Labeling Kit (5 nmol). IPG strips with a pH 3–11 gradient (Immobiline DryStrip pH 3–11 NL, 24 cm) were used for protein separation in the first dimension. Samples were applied to the strips via cup loading, and isoelectric focusing was performed on an IPGphorIII device. Second-dimension gel electrophoresis was performed on DIGE gels (12.5%, 26 x 20 cm) using Ettan DALTsix electrophoresis units and DIGE buffers. Gels were scanned with a Typhoon 9410 fluorescence scanner and analyzed using DeCyder software v7.2 (spot detection parameters: estimated number of spots 10,000 and exclusion filter V < 30,000). For the analysis, both experiments were linked using the same master gel.

Data analysis

Data of the linked experiment, representing log-standardized abundances of spots present in at least 80% of spot maps (i.e., data of an individual gel picture), were extracted from DeCyder v7.2 and used for further evaluation. This filter is suggested by the manufacturer (GE Healthcare) in order to exclude non-reproducible spots, which mostly represent artifacts. The data of the linked experiments were first evaluated without the negative control sample (CHOf_1) to reduce the bias introduced by the spot map filter and thereby gain a clearer picture of the differences between similar samples. To specifically determine the capability of the approach to differentiate a negative control, the whole dataset was later reanalyzed including the negative control. Prior to any further analysis, the technical replicates of each sample were averaged.

Principal component analysis (PCA)

Simca v13 software (Umetrics) was used for PCA. Euclidean distances between the PC scores of all analyzed preparations were calculated pairwise using a custom-written script in R (www.r-project.org). The Euclidean distances were scaled from 0 to 100, where a score of 0 indicates that the analyzed samples are identical and a score of 100 indicates the most strongly differing samples in the dataset. Results were obtained as a distance matrix for all pairwise comparisons (the obtained values are shown in supplementary data set S-1; 6).

Fold pass rate (FPR) analysis

The similarity fold pass rate (FPR) between two samples is defined as the fraction of spots for which the fold difference in standardized abundance is within a defined threshold (e.g., 1.5-, 2-, 3-, 5- or 10-fold difference). FPR scores were calculated using a custom-written script in R (www.r-project.org) and expressed as a percentage using the following formula:

\[ \mathrm{FPR}\ [\%] = \frac{\text{number of spots within the fold threshold}}{\text{total number of spots}} \times 100 \]

The FPR score was calculated for each sample pair, and the results were summarized as FPR matrices at the different thresholds, representing all pairwise comparisons (the obtained values are given in supplementary data set S-1; 1 and 2).
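As an illustration of the two metrics described in this section, the following Python sketch computes scaled pairwise Euclidean distances and FPR scores. It is not the authors' R script, which is not reproduced in the manuscript. It assumes that the log-standardized abundances are log10 values, that scaling to 0–100 means division by the largest pairwise distance in the dataset, and that a spot lying exactly on the threshold counts as passing; the function names and the example values are hypothetical.

```python
import math


def scaled_euclidean_distances(pc_scores):
    """Pairwise Euclidean distances between PC score vectors, scaled to 0-100.

    `pc_scores` maps a sample name to its list of principal-component scores.
    Assumption: the largest pairwise distance in the dataset is mapped to 100.
    """
    names = sorted(pc_scores)
    raw = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            raw[(a, b)] = math.dist(pc_scores[a], pc_scores[b])
    largest = max(raw.values())
    if largest == 0:
        # All samples coincide; every distance is zero.
        return {pair: 0.0 for pair in raw}
    return {pair: 100.0 * d / largest for pair, d in raw.items()}


def fold_pass_rate(log_abund_a, log_abund_b, threshold=2.0):
    """Percentage of matched spots differing by at most `threshold`-fold.

    Assumption: inputs are log10 standardized abundances of the same spots in
    the same order, so an n-fold difference corresponds to |a - b| <= log10(n).
    """
    if len(log_abund_a) != len(log_abund_b) or not log_abund_a:
        raise ValueError("both samples need the same non-zero number of spots")
    limit = math.log10(threshold) + 1e-12  # small tolerance for rounding
    passed = sum(1 for a, b in zip(log_abund_a, log_abund_b) if abs(a - b) <= limit)
    return 100.0 * passed / len(log_abund_a)


if __name__ == "__main__":
    # Made-up log10 abundances for five spots in two samples.
    sample_1 = [0.00, 0.10, -0.20, 0.50, 1.00]
    sample_2 = [0.05, 0.35, -0.25, 0.10, -0.10]
    for n in (1.5, 2, 3, 5, 10):
        print(f"{n}-fold FPR: {fold_pass_rate(sample_1, sample_2, n):.1f}%")

    # Made-up scores on the first two principal components.
    print(scaled_euclidean_distances({"A": [0, 0], "B": [3, 4], "C": [6, 8]}))
```

Because the pass criterion only loosens as the threshold grows, the FPR score of a given sample pair can never decrease when moving from a lower to a higher fold threshold.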

Results and discussion

2-D DIGE raw data

The linked experiment without the negative control resulted in 62 spot maps; the extracted dataset included normalized log-standardized abundances of 1,657 protein spots. The PCA (Figures 1 and 2A) and FPR analysis (Figure 2B) were performed using this dataset. These analyses are shown and described in detail below. To exemplify the 2-D DIGE analysis, a gel image with CHOa_0 as analyte is shown as Figure S-1 in the supporting information.

Transformation of 2-D DIGE data into numerical metrics of similarity

Principal component analysis (PCA)

The PCA model described 61% of the variation in the data using the first four principal components, with 41% explained by the first two components (PC1, PC2), reflecting the high complexity of the spot data in the multidimensional space. The plot of PC1 and PC2 data is shown in Figure 1. Biological replicates representing three separate runs of an individual host cell protein preparation (CHOa_8_BR1–3, CHOe_1_BR1–3) are positioned close to each other. Host cell protein preparations derived from the same CHO cell line tend to cluster, with the exception of CHOa_7, which is positioned distant from other preparations of the same cell line. Between cell lines, larger distances are observed for CHOc and CHOd. To express the degree of similarity as numerical values, Euclidean distances between all host cell protein preparations were determined (0, samples are identical; 100, most strongly differing samples). The resulting values are shown in a pairwise distance matrix together with a relationship dendrogram (Figure 2A).

Fold pass rate (FPR) analysis

The FPR score represents the percentage of spot pairs from two samples that are within an n-fold difference interval in standardized abundance. To determine an optimal threshold n for further similarity evaluation, FPR scores were calculated at five different threshold values: 1.5, 2, 3, 5 and 10. Results are presented in a Trellis plot (Figure 3) and a box plot (Figure 4). In order to determine the most suitable FPR threshold, two aspects were considered: discriminatory power and robustness. The FPR threshold should be sensitive enough to detect differences between different host cell preparations and, on the other hand, sufficiently robust that technical, experimental and biological replicates are not detected as different unless there are true differences between them. The distribution of the obtained data shows a condensation of the FPR scores towards 100% with increasing threshold levels (Figures 3 and 4). This suggests low discrimination at high thresholds, i.e., 5- and 10-fold. On the other hand, at a 1.5-fold threshold, biological replicates were recognized as different (Figure 3). Therefore, the 1.5-fold threshold criterion is too sensitive, leading to non-robust results. For the 2- or 3-fold threshold criterion, the biological replicates appear similar, while the FPR scores of the host cell preparations still cover a broad range (Figures 3 and 4). Based on these results, a threshold of 2- or 3-fold appears to be suitable; the 2-fold threshold was subsequently chosen as it represents the more sensitive choice to detect differences between host cell protein preparations. Similar thresholds were applied in other 2-D DIGE comparisons of HCP: two studies used a 1.5-fold threshold6,20 and one used a 2-fold threshold7. The resulting FPR scores calculated with a 2-fold threshold are shown in a pairwise matrix together with a relationship dendrogram (Figure 2B). Median FPR scores of each HCP preparation were further calculated (supplementary data set S-1; 2) and used for evaluation of the results. In principle, the FPR analysis could also be applied to proteomic datasets obtained by other analytical techniques, e.g., LC-MS, to obtain simple numerical values describing the similarity of the analyzed protein compositions. However, meaningful FPR thresholds would need to be re-evaluated, based on the discriminatory power of the chosen analytical approach.

PCA vs FPR

PCA (Figure 1, Figure 2A) and FPR analysis (Figure 2B) led to highly comparable results.
This is reflected in similar heat maps (Figure 2A/B) and dendrograms with similar clustering of results and differentiation of the more distinct host cell protein preparations (CHOc_1–2; CHOd_1–2; CHOa_7). When correlating PCA Euclidean distance values with FPR similarity scores, a coefficient of determination of R² = 0.92 (see supporting information, Figure S-2) was determined, demonstrating a high degree of correlation between the results. The main difference between the two approaches is that PCA calculates principal components from the standardized abundances of individual spots, whereas FPR counts the percentage of spots inside intervals determined by thresholds. Simple numerical values to describe the similarity of host cell protein preparations analyzed by 2-D DIGE can be obtained by both approaches. PCA distance scores are relative, meaning that the obtained values vary between experiments depending on the number and diversity of analyzed host cell protein preparations. FPR scores, on the other hand, are independent of the experimental design, meaning that results from different experiments and projects can be compared directly. Therefore, we consider the FPR-based approach more useful for the described purpose. However, the PCA-derived PC plot (Figure 1) still represents a useful tool to visualize the relationship and distances between host cell protein preparations. Furthermore, PCA does not require a priori assumptions regarding the extent of differences, since, unlike in the FPR analysis, no threshold is set in advance.

Method variability, batch-to-batch variability, and inter-process differences

Variability of the results can be evaluated at three different levels: a) Method variability: between technical replicates (each sample analyzed in triplicate), and between experimental replicates (the CHOa_0 sample was analyzed in two experiments, CHOa_0_ER1–2); b) Batch-to-batch variability: between biological replicates (CHOa_8_BR1–3; CHOe_1_BR1–3); c) Inter-process differences: between HCP preparations originating from different processes (Table 1).
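The three levels listed above can be made concrete by grouping pairwise FPR scores according to the kind of comparison they represent and summarizing each group by its median. The Python sketch below shows one possible way to do this; the record layout (`process`, `batch`, `replicate`), the function names and all scores are hypothetical and are not taken from the supplementary data. Experimental replicates (the same sample analyzed in two experiments) are not modeled separately in this simplified version.

```python
from collections import defaultdict
from statistics import median


def variability_level(sample_a, sample_b):
    """Classify a sample pair by the level of variability it probes."""
    if sample_a["process"] != sample_b["process"]:
        return "inter-process"
    if sample_a["batch"] != sample_b["batch"]:
        return "batch-to-batch"
    # Same process and same batch: only the technical replicate differs.
    return "method"


def median_fpr_by_level(pairs):
    """Median FPR score per variability level.

    `pairs` is an iterable of (sample_a, sample_b, fpr_score) tuples.
    """
    grouped = defaultdict(list)
    for sample_a, sample_b, score in pairs:
        grouped[variability_level(sample_a, sample_b)].append(score)
    return {level: median(scores) for level, scores in grouped.items()}


if __name__ == "__main__":
    # Made-up samples and scores, for illustration only.
    s1 = {"process": "CHOa_8", "batch": 1, "replicate": 1}
    s2 = {"process": "CHOa_8", "batch": 1, "replicate": 2}
    s3 = {"process": "CHOa_8", "batch": 2, "replicate": 1}
    s4 = {"process": "CHOd_1", "batch": 1, "replicate": 1}
    pairs = [
        (s1, s2, 98.0),
        (s1, s3, 95.0), (s2, s3, 96.0),
        (s1, s4, 60.0), (s2, s4, 62.0), (s3, s4, 64.0),
    ]
    print(median_fpr_by_level(pairs))
```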
Variability between technical replicates and between experimental replicates was low, as expected, with median FPR similarity scores of 98.5% and 98.3%, respectively (supplementary data set S-1; 3 and 4). The variability between biological replicates (CHOa_8_BR1–3; CHOe_1_BR1–3) was also low, with FPR similarity scores between 92% and 100% (Figure 2B). FPR similarity scores for different processes ranged from 51% to 100% (Figure 2B). It is apparent that inter-process variability (median FPR similarity score = 74.4% ± 12.0%, supplementary data set S-1; 2) is much larger than intra-process / batch-to-batch variability (median FPR similarity score = 95.7% ± 3.3% for CHOe_1, and 99.6% ± 0.2% for CHOa_8, supplementary data set S-1; 5). This clearly demonstrates that different upstream processes can lead to differences in the HCP composition at harvest and that these differences can be detected by the FPR approach. To determine the capability of the approach to differentiate a negative control, the whole dataset was reanalyzed including the negative control (see details in supplementary data set S-1; 7). A significantly lower median FPR similarity score (36.4%, supplementary data set S-1; 7) was obtained for the negative control CHOf_1 (originating from a distinctly different process), demonstrating the capability of the chosen approach to recognize significant differences. Overall, the data show the suitability of the approach to investigate differences between host cell protein preparations.

Similarity analysis of HCP from different preparations

HCP compositions of 15 different cultivation processes from 5 different CHO cell lines were investigated (see Table 1, excluding the negative control). Processes covered various upstream conditions: harvesting times ranged from day 9 to day 17, viable cell density at harvest varied between 1.2e6 and 17.2e6 cells/mL, and viability at harvest was between 44% and 99%. Related cultivation data are shown in Figure 5. Despite the wide range of different process conditions, cell lines, harvest dates, etc., a relatively high degree of similarity was found between host cell protein preparations, with a median similarity score of 74.4% ± 12.0% (2-fold threshold FPR).
The median FPR similarity score for the host cell protein preparations reaches values close to 100% when applying the 5- and 10-fold FPR thresholds (97.4% ± 3.1% and 99.6% ± 1.1%, respectively; see also Figure 4), which demonstrates that only a few proteins change strongly in abundance or are present in one cultivation but absent in another. Only 2.6% of the proteins are changed ≥ 5-fold and only 0.4% are changed ≥ 10-fold between all preparations. Differences in HCP composition between cultivation processes are still larger than differences caused by batch-to-batch variability (see above). However, even the differences in HCP composition between preparations from diverse CHO cultivation processes are generally not very large. Furthermore, these differences are mostly quantitative (i.e., different amounts of individual HCP) and not qualitative (i.e., presence or absence of individual HCP). This finding is in line with previous reports7,10,20-22, which focused on the effect of defined upstream process changes, e.g., the use of different CHO cell lines, while the present study aims at reflecting overall differences in HCP for a wide range of actual manufacturing processes. Nevertheless, more pronounced effects of certain process differences, e.g., the use of different cell lines, were also observed in this study. PCA did not clearly distinguish between most cell lines (CHOa, b, c, e), but CHOd_(1–2) preparations are found distant from other analyzed preparations in the plot (Figure 1). The medians of all FPR similarity scores for CHOd_1 (61.5%) and CHOd_2 (62.3%) are also lower than the median overall FPR similarity score of all samples (74.4%), a difference that was found to be statistically significant by a two-sample t-test (CHOd_1: t-value = -6.482, p-value = 0.000; CHOd_2: t-value = -7.808, p-value = 0.000). This means that the HCP compositions of the CHOd_(1–2) preparations differ significantly more strongly from the majority of analyzed host cell protein preparations than the others do. This could be caused by the cultivation characteristics of CHOd_(1–2), which are distinct from those of the other CHO cell lines (Figure 5): viable cell densities were generally lower compared to other cell lines, with a maximum reached on the day of harvest (Figure 5A). For all other cell lines, maximum viable cell density was reached a few days before harvest.
CHOc_(1–2) also appear distinct from other preparations on the PC plot (Figure 1). Although this is reflected less strongly in the median FPR similarity scores obtained for the CHOc HCP preparations (CHOc_1 = 68.6%; CHOc_2 = 71.5%) compared with the median overall FPR similarity score of all samples (74.4%; two-sample t-test: CHOc_1: t-value = -3.054, p-value = 0.002; CHOc_2: t-value = -1.438, p-value = 0.081), it could also be related to the cultivation characteristics of CHOc. In both cases, a generally lower viable cell density compared to the other cell lines was observed during cultivation (Figure 5A), meaning that for CHOc, too, cultivation characteristics differed from those of the other cell lines. The more distinct HCP compositions of CHOc and CHOd cultivations cannot be explained by larger genetic differences of these cell lines. All CHO cell lines originate from the original CHO cell line (Puck et al., 1958) as shown in Figure 6; CHO 1 was subcloned from the original CHO cell line, and for CHO 2 a mutation was introduced as a selection marker. CHOa, c, d, e and f descend from CHO 1; only CHOb descends from CHO 2, meaning that genetically CHOb is expected to be the most distant one. If a clear effect of genealogy were present, one would rather expect the CHOb preparations to have the most distinct HCP composition, which was not observed. Thus, different CHO cell lines could have an impact on the HCP, which is likely to be related to different cultivation characteristics of the cell lines, e.g., viable cell densities, growth rate, etc., and not only to genetic differences. Generally, the result of this study agrees with the result of a previous study10, demonstrating that the use of different CHO cell lines has mostly a quantitative (i.e., different amounts of individual HCP) rather than a qualitative (i.e., presence or absence of individual HCP) effect on the HCP composition at harvest. Another HCP preparation that clearly differed from the others was CHOa_7. Even though the same CHO cell line as for all eight CHOa cultivations was used, it is displaced on the PC plot (Figure 1).
A low median FPR similarity score was also determined (58.0%), which is significantly lower (two-sample t-test: t-value = -10.496, p-value = 0.000) than the median overall FPR similarity score of all samples (74.4%). The divergence of the CHOa_7 HCP preparation from the others could result from multiple parameters. CHOa_7 was harvested at day 10 with a relatively low viability of 79%, instead of the usual harvest for CHOa at day 14 (Figure 5). Another reason for these differences could be the processing after harvest, because the cleared CHOa_7 supernatant was strongly concentrated (by a factor of ~28), diluted by diafiltration and finally sterile filtered. During the concentration step a high intermediate HCP concentration (~30 mg/mL) was reached and the solution was reported to be turbid. For most other cases the total protein concentration during preparation remained between 0 and 10 mg/mL. Even though only a slight turbidity was reported after diluting the solution again, the HCP composition probably underwent some changes during downstream processing, as precipitation occurred. However, it is unclear whether this led to the observed changes in the HCP composition. No further significant differences in cultivation or processing relative to the other HCP preparations that could explain the divergence of CHOa_7 were identified. When analyzing the negative control (CHOf_1, see supplementary data set S-1; 7), a strikingly low median FPR similarity score of 36.4% was determined. This can easily be explained by the very different processing: repeated batch fermentation instead of fed-batch fermentation; different cell line (CHOf); additional processing step (ion exchange chromatography) after harvest; etc. These changes are expected to have a critical impact on the HCP composition, justifying the low FPR similarity score obtained for this preparation. Taken together, these results show that certain process changes, e.g., the use of different CHO cell lines, can lead to differences in HCP composition. However, these differences were mostly small and of a quantitative (different abundances of the same HCP), not qualitative (presence/absence of different HCP), nature.
Overall a high degree of similarity was observed for the very diverse set of CHO harvest supernatants. In addition the approach was capable to recognize very different HCP compositions, e.g., as expected for the negative control.


Conclusions

2-D DIGE proved to be a powerful tool to characterize and compare HCP preparations from diverse cultivation processes. By using the FPR for similarity scoring, it was possible to convert the 2-D DIGE data into a simple numerical measurement of percent similarity between preparations. The approach was found to precisely, reproducibly and robustly recognize differences in HCP compositions. Overall, a high degree of similarity was found for HCP preparations from a variety of cultivation processes using different CHO cell lines, which is in agreement with other publications (10, 20). Minor variations in CHO HCP compositions between different cultivation processes generally support the notion of using a platform HCP immunoassay for multiple biopharmaceutical products (10, 20). However, the present study also demonstrates that certain process changes, e.g., the use of CHO cell lines with particular growth profiles, can lead to more pronounced changes in HCP, mostly of a quantitative (i.e., different amounts of individual HCP) and less of a qualitative (i.e., presence or absence of individual HCP) nature. This has to be taken into consideration when using a platform HCP assay, which requires careful case-by-case evaluation. The presented methodology can be used for a data-driven decision on the suitability of a platform HCP assay standard for a new product, but also as a general tool to compare HCP compositions of different cultivation processes.
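The idea of turning spot-level data into a single percent-similarity value can be illustrated with a minimal sketch. It assumes that the FPR is the percentage of matched spots whose abundance ratio between two preparations stays within a fold threshold (2-fold by default); this section does not restate the exact definition, so the function name `fold_pass_rate`, its arguments and the example abundances are hypothetical and serve only as an illustration.

```python
def fold_pass_rate(abund_a, abund_b, threshold=2.0):
    """Percent of matched spots whose abundance ratio is within `threshold`-fold.

    Assumption: both lists hold normalized abundances of the same matched
    spots, in the same order, and every value is positive. Spots missing in
    one preparation (qualitative differences) are not handled here.
    """
    if len(abund_a) != len(abund_b) or not abund_a:
        raise ValueError("need two non-empty lists of equal length")
    if threshold < 1.0:
        raise ValueError("fold threshold must be >= 1")
    if any(v <= 0 for v in list(abund_a) + list(abund_b)):
        raise ValueError("abundances must be positive")

    passed = 0
    for a, b in zip(abund_a, abund_b):
        # Symmetric fold change: always >= 1, regardless of direction.
        fold = max(a / b, b / a)
        if fold <= threshold:
            passed += 1
    return 100.0 * passed / len(abund_a)


# Hypothetical abundances for four matched spots in two preparations.
prep_1 = [100, 50, 20, 10]
prep_2 = [110, 120, 21, 4]

for t in (1.5, 2, 3, 5, 10):
    print(t, fold_pass_rate(prep_1, prep_2, threshold=t))
```

Under this assumed definition, a stricter threshold can only lower the score or leave it unchanged, which is why comparing several thresholds gives a picture of how large the quantitative differences are.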

Acknowledgement

We thank Matej Birk (Novartis Pharma AG, Mengeš, Slovenia), who performed the 2-D DIGE analysis. We also thank Joel Tapparel (Novartis Pharma AG, Basel, Switzerland) for the preparation of several host cell protein preparations included in the report.


Conflict of interest disclosure

All authors are employed by companies belonging to the Novartis group: Darja Obrstar, Boštjan Japelj and Lea Bojić are employed by Novartis Pharma AG, Mengeš, Slovenia, and Frieder Kroener and Oliver Anderka are employed by Novartis Pharma AG, Basel, Switzerland.

Supporting information

Supporting information is compiled in Supporting Information for Publication.docx. It includes Supplementary data set S-1, Figure S-1 and Figure S-2. Supplementary data set S-1 is embedded as Supplementary data set S1.xlsx and contains the FPR calculations for individual values (Sheet 1), averaged technical, experimental and biological replicates (Sheets 2-4), variability of biological replicates (Sheet 5), PCA scores (Sheet 6), the analysis including the negative control samples (Sheet 7), and the negative control cultivation course (Sheet 8). Figure S-1 exemplifies the 2-D DIGE separation by showing the gel image of CHOa_0. Figure S-2 shows the correlation between PCA distances and FPR scores.


References

(1) Liu, H. F.; Ma, J.; Winter, C.; Bayer, R. MAbs 2010, 2, 480-499.
(2) de Zafra, C. L.; Quarmby, V.; Francissen, K.; Vanderlaan, M.; Zhu-Shimoni, J. Biotechnol Bioeng 2015, 112, 2284-2291.
(3) FDA. Journal of immunotherapy (Hagerstown, Md. : 1997) 1997, 20, 214-243.
(4) Zhu-Shimoni, J.; Yu, C.; Nishihara, J.; Wong, R. M.; Gunawan, F.; Lin, M.; Krawitz, D.; Liu, P.; Sandoval, W.; Vanderlaan, M. Biotechnol Bioeng 2014, 111, 2367-2379.
(5) U.S. Pharmacop. (USP38-NF33 2S). U.S. Pharmacop. PF40 (4). 2014.
(6) Grzeskowiak, J. K.; Tscheliessnig, A.; Toh, P. C.; Chusainow, J.; Lee, Y. Y.; Wong, N.; Jungbauer, A. Protein Expr Purif 2009, 66, 58-65.
(7) Jin, M.; Szapiel, N.; Zhang, J.; Hickey, J.; Ghose, S. Biotechnol Bioeng 2010, 105, 306-316.
(8) Tait, A. S.; Hogwood, C. E.; Smales, C. M.; Bracewell, D. G. Biotechnol Bioeng 2012, 109, 971-982.
(9) Valente, K. N.; Schaefer, A. K.; Kempton, H. R.; Lenhoff, A. M.; Lee, K. H. Biotechnol J 2014, 9, 87-99.
(10) Yuk, I. H.; Nishihara, J.; Walker, D., Jr.; Huang, E.; Gunawan, F.; Subramanian, J.; Pynn, A. F.; Yu, X. C.; Zhu-Shimoni, J.; Vanderlaan, M.; Krawitz, D. C. Biotechnol Bioeng 2015, 112, 2068-2083.
(11) Arentz, G.; Weiland, F.; Oehler, M. K.; Hoffmann, P. Proteomics Clin Appl 2015, 9, 277-288.
(12) Paul, D.; Kumar, A.; Gajbhiye, A.; Santra, M. K.; Srikanth, R. Biomed Res Int 2013, 2013, 783131.
(13) Silberring, J.; Ciborowski, P. Trends Analyt Chem 2010, 29, 128.
(14) Sriharshan, A.; Boldt, K.; Sarioglu, H.; Barjaktarovic, Z.; Azimzadeh, O.; Hieber, L.; Zitzelsberger, H.; Ueffing, M.; Atkinson, M. J.; Tapio, S. J Proteomics 2012, 75, 2319-2330.
(15) Thon, J. N.; Schubert, P.; Duguay, M.; Serrano, K.; Lin, S.; Kast, J.; Devine, D. V. Transfusion 2008, 48, 425-435.
(16) Wu, W. W.; Wang, G.; Baek, S. J.; Shen, R. F. J Proteome Res 2006, 5, 651-658.
(17) Alban, A.; David, S. O.; Bjorkesten, L.; Andersson, C.; Sloge, E.; Lewis, S.; Currie, I. Proteomics 2003, 3, 36-44.
(18) Meleady, P.; Doolan, P.; Henry, M.; Barron, N.; Keenan, J.; O'Sullivan, F.; Clarke, C.; Gammell, P.; Melville, M. W.; Leonard, M.; Clynes, M. BMC Biotechnol 2011, 11, 78.
(19) Unlu, M.; Morgan, M. E.; Minden, J. S. Electrophoresis 1997, 18, 2071-2077.
(20) Krawitz, D. C.; Forrest, W.; Moreno, G. T.; Kittleson, J.; Champion, K. M. Proteomics 2006, 6, 94-110.
(21) Champion, K. M.; Nishihara, J. C.; Joly, J. C.; Arnott, D. Proteomics 2001, 1, 1133-1148.
(22) Obrstar, D.; Mandelc, S.; Stojkovic, S.; Francky, A.; Bojic, L.; Javornik, B. J Biotechnol 2016, 219, 98-109.


Table legends

Table 1: Cultivation of different CHO cell lines
Six CHO cell lines (CHOa, -b, -c, -d, -e, -f) were mock transfected with an expression vector lacking the gene of interest of the respective production strain. These were then cultivated in the related product-specific processes (process 1–15), optimized for the respective production strain.

Figure legends

Figure 1: PC plot of analyzed host cell protein preparations
PC score plot showing PCA results for the different analyzed host cell protein preparations. Each dot has been labelled with its corresponding preparation (arrows have been added to mark the closely clustered dots).

Figure 2: Pairwise comparison of host cell protein preparations
A) Pairwise normalized Euclidean distance between PC scores (PCA) with labelled distances and B) pairwise mAb 2-fold pass rate (FPR) comparison, with the related dendrogram. In both cases, FPR scores obtained for the three technical replicates of the respective host cell protein preparation were averaged. A) Color gradient from low (orange) to high distance scores (blue) for the PCA. B) Color gradient from high similarity scores (orange) to low similarity scores (blue) for the % FPR analysis.

Figure 3: FPR distribution of experimental / biological replicates and host cell preparations at different threshold levels
Each strip of the Trellis plot for thresholds of 1.5, 2, 3, 5 and 10 shows the FPR grouped into biological replicates (BR), experimental replicates (ER) or host cell protein preparations (Preparations). FPR scores (from the strictly lower triangular matrix) were averaged from the three technical replicates performed for each sample. Vertical grey lines show the ranges for BRs, ERs and Preparations. The small grey dots represent median values for the individual groups.

Figure 4: Overall %-similarity scores at different thresholds for FPR
Distributions of FPR similarity scores for thresholds of 1.5, 2, 3, 5 and 10 are represented in a box plot. The similarity scores used (from the strictly lower triangular matrix) were averaged from the three technical replicates performed for each sample. The box plot shows the median similarity, the upper and lower quartiles, and the whiskers for each threshold. To avoid biasing the results, the biological replicates of CHOa_8 and CHOe_1 and the experimental replicates (CHOa_0) were each considered as one preparation by taking the median FPR similarity score of these replicates.

Figure 5: Cultivation courses
Cultivation courses of all CHO cultivations used to produce the analyzed host cell protein preparations are presented as A) the course of viable cell density over time and B) the course of %-viability over time. Graphs for biological replicates BR1-BR3 of CHOa_8 and CHOe_1 show the average course of all three replicates.

Figure 6: Genetic relationship of CHO cell lines
The phylogenetic tree of the different CHO cell lines shows the genetic relationship and the link to the original CHO cell line.
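The pairwise comparison described in the legends of Figures 2 to 4 can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that "normalized" means scaling all distances by the largest pairwise distance, and the function names `pairwise_normalized_distances` and `summarize`, as well as the example PC scores, are hypothetical. Only the strictly lower triangle of the pairwise matrix is kept, so each pair of preparations is counted once and self-comparisons are excluded.

```python
import math
import statistics


def pairwise_normalized_distances(pc_scores):
    """Euclidean distances between PC scores for every pair of preparations.

    pc_scores: dict mapping preparation name -> tuple of PC scores.
    Returns a dict keyed by (row_name, column_name) for the strictly lower
    triangle (row index > column index), scaled so the largest distance is 1.
    Assumption: scaling by the maximum is one plausible normalization.
    """
    names = list(pc_scores)
    raw = {}
    for i in range(1, len(names)):
        for j in range(i):  # j < i: strictly lower triangle
            raw[(names[i], names[j])] = math.dist(
                pc_scores[names[i]], pc_scores[names[j]]
            )
    if not raw:
        raise ValueError("need at least two preparations")
    largest = max(raw.values())
    if largest == 0:
        raise ValueError("all preparations have identical PC scores")
    return {pair: d / largest for pair, d in raw.items()}


def summarize(pairwise):
    """Median and range of the lower-triangle values, as used for a box plot."""
    values = list(pairwise.values())
    return {
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }


# Hypothetical PC1/PC2 scores for three preparations.
scores = {"prep_A": (0.0, 0.0), "prep_B": (3.0, 4.0), "prep_C": (6.0, 8.0)}
distances = pairwise_normalized_distances(scores)
print(distances)
print(summarize(distances))
```

The same lower-triangle bookkeeping applies to a matrix of pairwise FPR similarity scores; in that case the values are already percentages and no scaling step is needed.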


Table 1: Cultivation of different CHO cell lines

Host cell protein preparation    Process    CHO cell line
CHOa_0*                          0**        CHOa
CHOa_1–7                         1–7        CHOa
CHOa_8_BR1–3                     8          CHOa
CHOb_1                           9          CHOb
CHOc_1                           10         CHOc
CHOc_2                           11         CHOc
CHOd_1                           12         CHOd
CHOd_2                           13         CHOd
CHOe_1_BR1–3                     14         CHOe
CHOf_1***                        15         CHOf

* parental cell line
** early version of Novartis’ upstream platform process
*** negative control


Table of Contents graphic

Figure 1: PC plot of analyzed host cell protein preparations

Figure 2: Pairwise comparison of host cell protein preparations

Figure 3: FPR distribution of experimental / biological replicates and host cell preparations at different threshold levels

Figure 4: Overall %-similarity scores at different thresholds for FPR

Figure 5: Cultivation courses

Figure 6: Genetic relationship of CHO cell lines