Open Access Highly Accessed Research article

Cigarette smoking, cytochrome P4501A1 polymorphisms, and breast cancer among African-American and white women

Yu Li1, Robert C Millikan1*, Douglas A Bell2, Lisa Cui1, Chiu-Kit J Tse1, Beth Newman3 and Kathleen Conway1

Author Affiliations

1 Department of Epidemiology, School of Public Health and Lineberger Comprehensive Cancer Center, School of Medicine, University of North Carolina, Chapel Hill, North Carolina, USA

2 National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina, USA

3 School of Public Health, Queensland University of Technology, Kelvin Grove, Australia

Breast Cancer Res 2004, 6:R460-R473  doi:10.1186/bcr814

The electronic version of this article is the complete one and can be found online at:

Received: 16 March 2004
Revisions received: 6 May 2004
Accepted: 14 May 2004
Published: 15 June 2004

© 2004 Li et al.; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.



Previous epidemiologic studies suggest that women with variant cytochrome P4501A1 (CYP1A1) genotypes who smoke cigarettes are at increased risk for breast cancer.


We evaluated the association of breast cancer with CYP1A1 polymorphisms and cigarette smoking in a population-based, case–control study of invasive breast cancer in North Carolina. The study population consisted of 688 cases (271 African Americans and 417 whites) and 702 controls (285 African Americans and 417 whites). Four polymorphisms in CYP1A1 were genotyped using PCR/restriction fragment length polymorphism analysis: M1 (also known as CYP1A1*2A), M2 (CYP1A1*2C), M3 (CYP1A1*3), and M4 (CYP1A1*4).


No associations were observed for CYP1A1 variant alleles and breast cancer, ignoring smoking. Among women who smoked for longer than 20 years, a modest positive association was found among women with one or more M1 alleles (odds ratio [OR] = 2.1, 95% confidence interval [CI] = 1.2–3.5) but not among women with non-M1 alleles (OR = 1.2, 95% CI = 0.9–1.6). Odds ratios for smoking longer than 20 years were higher among African-American women with one or more M3 alleles (OR = 2.5, 95% CI = 0.9–7.1) compared with women with non-M3 alleles (OR = 1.3, 95% CI = 0.8–2.2). ORs for smoking in white women did not differ appreciably based upon M2 or M4 genotypes.


Cigarette smoking increases breast cancer risk in women with CYP1A1 M1 variant genotypes and in African-American women with CYP1A1 M3 variant genotypes, but the modifying effects of the CYP1A1 genotype are quite weak.

African Americans; breast cancer; cytochrome P4501A1

Introduction


Cigarette smoking is a major route of exposure to many potential human carcinogens. Cigarette smoking has been associated with increased risk of breast cancer in some epidemiologic studies, but many studies showed no effect or an inverse association [1,2]. The most consistent finding appears to be a weak positive association following a long duration of smoking [3]. In an attempt to resolve these inconsistencies, recent epidemiologic studies have focused on interactions between cigarette smoking and genetic factors involved in the metabolism of tobacco-related carcinogens [4]. If interactions are observed between smoking and genes involved in the metabolism of specific compounds, a stronger case can be made that associations between smoking and breast cancer are causal and not due entirely to chance or to bias.

The cytochrome P4501A1 (CYP1A1) gene encodes an enzyme with aryl hydrocarbon hydroxylase activity. Formation of aryl epoxides by aryl hydrocarbon hydroxylase is the first step in the metabolism of polycyclic aromatic hydrocarbons from cigarette smoke [5]. The activity of aryl hydrocarbon hydroxylase encoded by the CYP1A1 gene has been observed in both normal and neoplastic human breast epithelium [6,7]. Some studies suggest that heterocyclic amines are activated by CYP1A1 via N-hydroxylation in breast tissue [8]. Four common polymorphisms of the CYP1A1 gene have been identified: M1, a T → C substitution at nucleotide 3801, giving rise to a MspI restriction site in the 3'-noncoding region [9]; M2, nucleotide 2455 A → G, resulting in an amino acid change at codon 462 of isoleucine to valine within the heme-binding domain of exon 7 [10]; M3, nucleotide 3205 T → C, creating a MspI restriction fragment length polymorphism (RFLP) in the 3'-noncoding region [11]; and M4, nucleotide 2453 C → A, resulting in an amino acid substitution at codon 461 of threonine to asparagine [12].

The functional significance of variant CYP1A1 genotypes is unclear. Studies of CYP1A1 in cultured human lymphocytes showed significantly elevated levels of inducible enzyme activity among M2 genotypes compared with the wild-type genotype [13-15]. Crofts and colleagues [15] reported that M2 alleles appeared to be associated with CYP1A1 inducibility at the level of transcription, followed by a threefold elevation in aryl hydrocarbon hydroxylase enzyme activity. The M1 allele was also reported to be more readily inducible than the CYP1A1 wild-type allele [14,16,17].

Several epidemiologic studies evaluated the relationship between cigarette smoking, CYP1A1 polymorphisms, and breast cancer risk [18-22]. Some studies reported that M1 and M2 variants increase risk of breast cancer [18,21], while other studies did not observe main effects for CYP1A1 variants [19,21]. Joint effects of smoking and CYP1A1 variants on breast cancer risk have been reported in two studies [18,22].

In a population-based, case–control study of African-American and white women in North Carolina, we examined the associations of CYP1A1 genotypes and cigarette smoking with breast cancer. We hypothesized that women with CYP1A1 variant alleles and high levels of smoking exposure (longer duration, higher dose, earlier age at initiation) as well as exposure to environmental tobacco smoke (ETS) would be at increased risk for breast cancer.

Materials and methods

Study design and participants

The Carolina Breast Cancer Study (CBCS) is a population-based, case–control study of breast cancer in African-American and white women in North Carolina [23,24]. The cases were women with a first diagnosis of invasive breast cancer identified through a rapid ascertainment system implemented in cooperation with the North Carolina Central Cancer Registry. The coverage rate of the Central Cancer Registry for incident breast cancer cases was 97% [23]. Controls were selected from lists provided by the North Carolina Division of Motor Vehicles for women younger than 65 years old, and from records of the US Health Care Financing Administration for women 65–74 years of age. Coverage rates for the underlying North Carolina population were 96% for the Division of Motor Vehicles list and 93% for the Health Care Financing Administration list [23].

The CBCS was conducted in two phases: phase 1 (1993–1996) and phase 2 (1996–2001). The present analysis is based upon phase 1 participants, where the response rates were 76% for cases and 55% for controls. Detailed methods have been reported previously [23]. Interview data included reproductive history, lifestyle factors, a detailed family history, medical history, and occupational history. Approximately 98% of participants who were interviewed agreed to give a 30 ml blood sample at the time of interview. Informed consent to obtain genomic DNA from the blood was sought using a form approved by the Institutional Review Board of the University of North Carolina, School of Medicine and in compliance with the Helsinki Declaration. For the present study, we genotyped CYP1A1 on the first 688 breast cancer cases and 702 controls enrolled in phase 1 of the CBCS. Due to financial constraints, we were unable to genotype all study participants. The genotyped participants correspond to 80% of the 861 total cases and 89% of the 790 total controls enrolled in phase 1 of the CBCS.

Laboratory methods

Genomic DNA was extracted from peripheral blood leukocytes using an automated DNA extractor (Applied Biosystems, Foster City, CA, USA) and was stored at 4°C. PCR-RFLP assays were designed to detect each of the variant CYP1A1 alleles. Several systematic nomenclatures for CYP1A1 have been proposed [25]. Wild-type CYP1A1 has been referred to as CYP1A1*1 or CYP1A1*1A, M1 as CYP1A1*2 or CYP1A1*2A, M2 as CYP1A1*2B (for the combination of M1 + M2) or CYP1A1*2C (for M2 only), M3 as CYP1A1*3, and M4 as CYP1A1*4. To avoid confusion, we have retained the M1–M4 designations, as recently adopted by Bartsch and colleagues [25]. The variant alleles M2 and M4 lose BsaI and BsrDI restriction sites at nucleotides 4889 and 4887, respectively. An internal control for the completeness of digestion was created by introducing a restriction-enzyme site in the PCR primers designed for genotyping of M2 and M4 alleles. The variant alleles M1 and M3 contain MspI restriction sites at nucleotides 6235 and 5639, respectively. These restriction sites do not exist on the wild-type allele.

PCR primers used to determine M2 and M4 alleles include the forward primer 5'-CTGTCTCCCTCTGGTTACAGGAAGC-3', which contains a BsrDI site, and the reverse primer 5'-TTCCACCCGTTGCAGCAGGATAGCC-3', which contains a BsaI site. PCR began with a 25 μl reaction containing 50 ng genomic DNA, 0.3 μmol forward and reverse primers, 1.25 U Taq DNA polymerase (Perkin-Elmer, Foster City, CA, USA) and 1 × PCR buffer (50 mM KCl, 10 mM Tris–HCl [pH 9.0], 1.5 mM MgCl2, and 0.1% Triton X-100). After samples were heated to 80°C, 50 μmol each of dATP, dGTP, dCTP, and dTTP (Pharmacia, Piscataway, NJ, USA) were added. PCR was carried out in a thermocycler 4800 (Perkin-Elmer Cetus, Norwalk, CT, USA). Amplification conditions included 95°C for 5 min, 62°C for 2 min, and 72°C for 2 min for the first cycle; 95°C for 1 min, 62°C for 2 min, and 72°C for 2 min for the following 33 cycles; and 95°C for 1 min and 72°C for 10 min for the final cycle.

PCR products of length 214 bp were produced after amplification, and 4 μl of PCR product was subjected to BsrDI (for M2) or BsaI (for M4) digestion (New England Biolabs, Beverly, MA, USA) at 37°C in a total volume of 13 μl. PCR products from the wild-type allele were digested into fragments of 149 bp and 55 bp, which were separated on a 15% polyacrylamide gel. M2 or M4 variant alleles were undigested at the polymorphic sites and appeared as a band of 206 bp on electrophoresis. A fragment of 214 bp represented incompletely digested PCR product, which could be distinguished from the band for M2 or M4 variant alleles on a 15% polyacrylamide gel.
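The band patterns described above can be expressed as a simple calling rule. The following is a minimal sketch, not the readers' actual procedure: the function name `call_m2_m4_genotype` is hypothetical, and the heterozygote pattern (cut and uncut bands in the same lane) is an assumption inferred from the description rather than stated in the text.

```python
def call_m2_m4_genotype(band_sizes_bp):
    """Call a genotype from the fragment sizes (bp) seen in one gel lane.

    Assumed band meanings, taken from the assay description:
      149 + 55 -> wild-type allele, cut at the polymorphic site
      206      -> variant allele, cut only at the internal control site
      214      -> full-length product, digestion incomplete
    """
    bands = set(band_sizes_bp)
    if 214 in bands:
        # The internal control site was not cut, so the lane is not reliable.
        return "incomplete digestion"
    has_cut = {149, 55} <= bands
    has_uncut = 206 in bands
    if has_cut and has_uncut:
        return "wt/variant"  # assumed heterozygote pattern
    if has_cut:
        return "wt/wt"
    if has_uncut:
        return "variant/variant"
    return "no call"


print(call_m2_m4_genotype([149, 55]))
print(call_m2_m4_genotype([206, 149, 55]))
```

Checking for the 214 bp band first mirrors the role of the internal control: a lane is only interpreted once digestion is known to be complete.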

PCR primers used to determine M1 and M3 alleles included the forward primer 5'-GGCTGAGCAATCTGACCCTA-3' and the reverse primer 5'-GGCCCCAACTACTCAGAGGCT-3'. Reaction components were the same as for the PCR for genotyping of M2 and M4 alleles. Amplification conditions included: 95°C for 5 min, 64°C for 2 min, and 75°C for 2 min for the first cycle; 95°C for 1 min, 64°C for 2 min, and 75°C for 2 min for the following 33 cycles; and 95°C for 1 min and 72°C for 10 min for the final cycle. PCR products of length 739 bp were produced, and 8 μl of PCR product was subjected to MspI/SphI digestion in a total volume of 25 μl. After 15% polyacrylamide gel electrophoresis, the separated bands of size 408 bp and 362 bp represented the wild-type and M1 alleles, respectively; and the bands of size 331 bp and 226 bp were determined as the wild-type and M3 alleles, respectively.

The introduced restriction-enzyme sites within the primers were designed as an internal control in order to assess the completeness of enzymatic digestion. Genotyping results were determined by two independent readers. Readers and laboratory personnel were blinded to the case–control status and other participant characteristics. When interpreting the results, the two readers were unaware of each other's interpretations. All discrepancies in genotyping results between readers were then resolved through group discussion, and agreement was achieved on all samples with discordant genotyping results. Repeat genotyping for 5% of samples (70 subjects) randomly selected from study subjects was performed to evaluate reproducibility. The results were 100% concordant for the repeat samples. Positive controls were also used for genotyping at each CYP1A1 locus.

Statistical analysis

Genotype frequencies were compared in cases versus controls using a chi-square test. When cell sizes were less than five, Fisher's exact test was used. In order to partially address the potential for linkage disequilibrium among CYP1A1 alleles, we cross-classified participants according to combinations of wild-type and variant alleles at each locus. Haplotypes (specific chromosomal combinations of CYP1A1 alleles) cannot be identified directly using PCR-RFLP techniques. For example, gel patterns cannot distinguish cis versus trans relationships for M1 and M3 alleles. CYP1A1 haplotype frequencies were thus estimated using the EH algorithm [26]. Haplotype frequencies in cases and controls were compared using a likelihood ratio test, as implemented in the EH algorithm.
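The cis versus trans ambiguity can be illustrated with a generic two-locus expectation-maximization estimate. This is a sketch under stated assumptions and is not the EH program itself: the function name `em_two_locus_haplotypes`, the flat starting values, the fixed iteration count, and the toy genotype counts are all hypothetical. Only double heterozygotes are ambiguous; every other genotype determines its two haplotypes directly.

```python
def em_two_locus_haplotypes(genotype_counts, n_iter=200):
    """Estimate two-locus haplotype frequencies from unphased genotype counts.

    genotype_counts maps (i, j) -> number of people carrying i variant alleles
    at locus 1 and j variant alleles at locus 2 (each 0, 1, or 2).
    Haplotype keys are strings such as "10": variant at locus 1, wild type at locus 2.
    """
    n_people = sum(genotype_counts.values())
    freq = {"00": 0.25, "01": 0.25, "10": 0.25, "11": 0.25}  # flat start (assumption)
    alleles = {0: (0, 0), 1: (0, 1), 2: (1, 1)}  # allele pair for each variant count
    for _ in range(n_iter):
        expected = dict.fromkeys(freq, 0.0)
        for (i, j), n in genotype_counts.items():
            if i == 1 and j == 1:
                # Double heterozygote: cis (00 + 11) or trans (01 + 10).
                cis = freq["00"] * freq["11"]
                trans = freq["01"] * freq["10"]
                w = cis / (cis + trans) if cis + trans > 0 else 0.5
                for hap in ("00", "11"):
                    expected[hap] += n * w
                for hap in ("01", "10"):
                    expected[hap] += n * (1 - w)
            else:
                # Phase is determined by the genotype, so count haplotypes directly.
                for a, b in zip(alleles[i], alleles[j]):
                    expected[f"{a}{b}"] += n
        freq = {hap: count / (2 * n_people) for hap, count in expected.items()}
    return freq


# Toy counts, invented for illustration only (not study data).
toy_counts = {(0, 0): 40, (1, 0): 5, (0, 1): 5, (1, 1): 10}
print(em_two_locus_haplotypes(toy_counts))
```

Each iteration splits the double heterozygotes between the cis and trans configurations in proportion to the current frequency estimates, then recounts haplotypes across the whole sample.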

Information concerning exposure to tobacco smoke was obtained from in-person interviews. The reference date was the date of diagnosis for cases and the date of selection for controls. To calculate the odds ratios (ORs) for active smoking, women who smoked fewer than 100 cigarettes in their lifetime constituted the referent group (nonsmokers). Women who never actively smoked but who had lived with a smoker after the age of 18 years were classified as exposed to ETS. The duration of active smoking was calculated by asking 'Keeping in mind that you may have stopped and started several times, overall how many years did you smoke regularly?' The dose of smoking was calculated in packs per day (20 cigarettes to a pack).
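The exposure definitions above translate into a short classification rule. The sketch below is illustrative only; the function names, argument names, and category labels are hypothetical and do not come from the study's analysis code.

```python
CIGARETTES_PER_PACK = 20        # "20 cigarettes to a pack"
NONSMOKER_LIFETIME_LIMIT = 100  # fewer than 100 lifetime cigarettes = nonsmoker


def classify_smoking_exposure(lifetime_cigarettes, lived_with_smoker_after_18):
    """Return 'active smoker', 'ETS only', or 'unexposed' for one participant."""
    if lifetime_cigarettes >= NONSMOKER_LIFETIME_LIMIT:
        return "active smoker"
    if lived_with_smoker_after_18:
        # Never smoked actively, but lived with a smoker after age 18.
        return "ETS only"
    return "unexposed"


def packs_per_day(cigarettes_per_day):
    """Convert a daily cigarette count into the dose measure (packs per day)."""
    return cigarettes_per_day / CIGARETTES_PER_PACK


print(classify_smoking_exposure(50, True))
print(packs_per_day(30))
```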

The ORs for active smoking and the CYP1A1 genotype did not differ when women with exposure to ETS were excluded from the referent group. Thus, to increase sample size, ORs for active smoking were calculated including women with ETS exposure in the referent group. For analyses of smoking and the CYP1A1 genotype on a multiplicative scale, analyses of ETS were conducted among women who never smoked themselves, using women exposed to neither active smoking nor ETS as a referent group [24]. Smoking information was missing for 10 cases and for seven controls.

The ORs and 95% confidence intervals (CIs) were used to measure associations between CYP1A1 genotypes and breast cancer and between smoking status and breast cancer, using unconditional logistic regression models. PROC GENMOD of the software package SAS (version 8.1; SAS Institute, Cary, NC, USA) was used to incorporate offsets derived from sampling probabilities used to identify eligible participants. Covariates included age, race (white/African American), age at menarche (≥ 12 years or < 12 years), parity (nulliparous, 1 and ≥ 2), family history of breast cancer (yes/no), benign breast biopsy (yes/no), and alcohol consumption (yes/no). ORs did not differ after adjusting for additional covariates, so results are presented adjusting for sampling fractions, age, and race (when appropriate).
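For orientation, an unadjusted OR and its 95% CI can be computed from a single 2 × 2 table. This is a simplified sketch using the standard log-scale (Wald) interval, not the offset-adjusted logistic regression described above; the function name and the counts in the example are hypothetical.

```python
import math


def crude_odds_ratio(exposed_cases, unexposed_cases,
                     exposed_controls, unexposed_controls, z=1.96):
    """Return (OR, lower, upper) for a 2 x 2 table using a Wald interval on log(OR)."""
    odds_ratio = (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)
    # Standard error of log(OR): square root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(1 / exposed_cases + 1 / unexposed_cases
                          + 1 / exposed_controls + 1 / unexposed_controls)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Invented counts for illustration only (not study data).
or_, lo, hi = crude_odds_ratio(20, 10, 10, 20)
print(f"OR = {or_:.1f}, 95% CI = {lo:.1f}-{hi:.1f}")
```

The adjusted estimates reported in the tables additionally account for sampling fractions, age, and race, so they would not generally match a crude calculation of this kind.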

Analyses of smoking effects according to CYP1A1 genotypes were conducted using the categories smoking status (current, former), duration, dose, time since cessation, and age at initiation. These categories and cut points were created in previous analyses that ignored the CYP1A1 genotype [24], and they represent aspects of smoking exposure that showed the strongest associations with breast cancer in the entire dataset. Stratified analyses were performed based on menopausal status, since previous analyses showed stronger effects for smoking in postmenopausal compared with premenopausal women [24]. Women were classified as postmenopausal if they had undergone natural menopause, bilateral oophorectomy, or irradiation to the ovaries. Natural menopause was defined as the cessation of regular (or approximately monthly) menstrual cycles. Women in the transition (perimenopausal) period were classified as postmenopausal. For women aged 50 years or older, postmenopausal status was assigned to those who had not stopped cycling but were taking hormone replacement therapy.
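The menopausal-status rules above can be written as a single decision function. This is a sketch only; the function and argument names are hypothetical, and the rules are applied in the order they appear in the text.

```python
def classify_menopausal_status(age_years, natural_menopause=False,
                               bilateral_oophorectomy=False,
                               ovarian_irradiation=False,
                               perimenopausal=False,
                               still_cycling=True, on_hrt=False):
    """Return 'postmenopausal' or 'premenopausal' following the stated rules."""
    if natural_menopause or bilateral_oophorectomy or ovarian_irradiation:
        return "postmenopausal"
    if perimenopausal:
        # Women in the transition period were grouped with postmenopausal women.
        return "postmenopausal"
    if age_years >= 50 and still_cycling and on_hrt:
        # Aged 50 or older, still cycling, but taking hormone replacement therapy.
        return "postmenopausal"
    return "premenopausal"


print(classify_menopausal_status(52, on_hrt=True))
print(classify_menopausal_status(45, on_hrt=True))
```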

To assess interaction on a multiplicative scale, ORs for smoking were estimated across strata of CYP1A1 genotypes, and separate logistic models with interaction terms between smoking and the CYP1A1 genotypes were analyzed. To estimate independent and joint effects of cigarette smoking and the CYP1A1 genotype on an additive scale, indicator variables were created for each category of joint exposure to a smoking variable and the CYP1A1 genotype (variant allele/variant allele or variant allele/wild-type allele). Women with the homozygous wild-type genotype (wild-type allele/wild-type allele) and the lowest level of exposure for the smoking variable were used as a common reference group.

Interaction contrast ratios (ICR) and CIs were calculated as described by Hosmer and Lemeshow [27] for the joint effects of the M1 locus and smoking. Data were too sparse to calculate ICRs for the remaining CYP1A1 loci. The ICR was calculated using the following formula [27]: ICR = OR11 - OR10 - OR01 + 1, where OR11 is the odds ratio for participants with smoking exposure and variant-containing CYP1A1 genotype, OR10 is the odds ratio for the variant CYP1A1 genotype among those unexposed to smoking, and OR01 is the odds ratio for smoking exposure among those with the nonvariant CYP1A1 genotype.

ICRs greater than zero imply greater than additive effects of smoking and the CYP1A1 genotype (synergy), ICRs of zero imply additive effects (no interaction on an additive scale), and ICRs less than zero imply less than additive effects (antagonism). The 95% CIs for ICRs that do not contain zero can be interpreted as statistically significant at an alpha level of 0.05.
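The ICR formula and its interpretation can be sketched in a few lines. The function names, the tolerance argument, and the example ORs are hypothetical and chosen only to illustrate the arithmetic; CIs for the ICR are not computed here.

```python
def interaction_contrast_ratio(or_11, or_10, or_01):
    """ICR = OR11 - OR10 - OR01 + 1.

    or_11: OR for smoking exposure together with a variant-containing genotype
    or_10: OR for the variant genotype among those unexposed to smoking
    or_01: OR for smoking exposure among those with the nonvariant genotype
    """
    return or_11 - or_10 - or_01 + 1.0


def interpret_icr(icr, tol=1e-9):
    """Map an ICR point estimate onto the additive-scale interpretation."""
    if icr > tol:
        return "greater than additive (synergy)"
    if icr < -tol:
        return "less than additive (antagonism)"
    return "additive (no interaction on an additive scale)"


# Invented ORs for illustration only (not study results).
icr = interaction_contrast_ratio(3.0, 1.5, 1.2)
print(round(icr, 2), interpret_icr(icr))
```

A positive point estimate alone does not establish synergy; as noted above, the 95% CI for the ICR must exclude zero for the interaction to be statistically significant at an alpha level of 0.05.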


Results

Characteristics of breast cancer cases and controls, and ORs for breast cancer and smoking as well as other risk factors, have been published previously for the CBCS [24]. Briefly, no association was observed between current smoking and breast cancer. Smoking of 20 years or longer duration and cessation of smoking within 3 years of the reference date among former smokers showed modest associations with breast cancer [24]. These associations were stronger among postmenopausal compared with premenopausal women. The ORs for smoking were similar in African-American and white women.

Genotype frequencies for CYP1A1 M1, M2, M3 and M4 are presented in Table 1. M1-containing genotypes were found in both African-American and white women, and at a slightly higher frequency among the African-American women. M2-containing and M4-containing genotypes were more common in white women, and M3-containing genotypes more common in African-American women. Genotype frequencies for CYP1A1 variants in our study population were similar to those in previous studies [18,20,21]. The CYP1A1 haplotype frequencies were estimated for the loci encoding M1 and M3 in African-American women, and those encoding M1, M2 and M4 in white women. Haplotype frequencies did not differ between cases and controls in African-American women (P = 0.9) and in white women (P = 0.95). Among African-American women, haplotype frequencies were wt + wt (67%), M1 + wt (23%), wt + M3 (10%) and M1 + M3 (< 0.01%). Among white women, haplotype frequencies were wt + wt + wt (83%), M1 + wt + wt (8%), wt + wt + M4 (4%), M1 + M2 + wt (4%), and the remaining combinations of alleles (total of 1%).

Table 1. Genotype frequencies for CYP1A1 among African-American and white participants in the Carolina Breast Cancer Study (1993–1996)

The ORs for CYP1A1 genotypes and breast cancer are presented in Table 2 for African-American and white women. Although a slight positive association was observed among African-American women for the M1/M1 genotype and breast cancer, and among white women for the M2/M2 genotype and breast cancer, the results are not statistically significant. The remaining ORs were close to 1.0, with the exception of an inverse association for M4 alleles in African-American women that was very imprecise due to small numbers. The ORs for CYP1A1 genotypes in premenopausal and postmenopausal women are presented in Table 3 and were also close to 1.0. ORs were unchanged after adjustment for family history, reproductive history, alcohol, smoking, or other breast cancer risk factors (data not shown).

We calculated the ORs for combined CYP1A1 alleles, but the results were imprecise due to small numbers. Among African-American women, using wild-type homozygotes for the M1 and M3 loci as a referent group, the ORs were 0.8 (95% CI = 0.5–1.4) for wt/wt (M1 locus) + any M3 (M3 locus), 0.9 (95% CI = 0.6–1.3) for any M1 (M1 locus) + wt/wt (M3 locus), and 1.4 (95% CI = 0.6–3.0) for any M1 (M1 locus) + any M3 (M3 locus). Among white women, using wild-type homozygotes for the M1, M2 and M4 loci as a referent group, the ORs were close to 1.0 for all combinations of M1, M2 and M4 alleles.

Table 2. Odds ratios (ORs) for breast cancer in relation to the CYP1A1 genotypes, in African-American and white women

Table 3. Odds ratios (ORs) for breast cancer in relation to CYP1A1 genotypes, according to menopausal status (African-American and white women combined)

The ORs for smoking and breast cancer in relation to CYP1A1 M1 genotypes are presented in Table 4. Among current smokers, the ORs were close to 1.0 for women with M1 (wt/M1 or M1/M1) as well as wild-type genotypes. ORs for former smoking, for smoking for longer than 20 years, for cessation of smoking within 10 years prior to the reference date, for initiation of smoking before age 18, and for exposure to ETS were higher among women with M1 genotypes compared with wild-type genotypes. The ORs were elevated further among carriers of one or more copies of the M1 allele for former smoking and for smoking cessation within 10 years in postmenopausal women, and for initiation of smoking prior to age 18 and for ETS exposure in premenopausal women. The ORs for the number of cigarettes per day did not show a dose–response relationship.

Table 4. Odds ratios (ORs) for breast cancer and smoking in relation to the CYP1A1 genotype at the M1 locus, according to menopausal status (African-American and white women combined)

Since the allelic frequencies for M2 were low in African-American women, the ORs for smoking and breast cancer according to M2 genotype are presented for white women only in Table 5. ORs for ETS exposure are not calculated for M2, M3 or M4 genotypes due to sparse data. ORs for active smoking did not differ appreciably according to the M2 genotype. Results stratified by menopausal status were very imprecise due to small numbers, but suggested a stronger effect of genotype in premenopausal women. In premenopausal women, the ORs for smoking for longer than 20 years were 3.6 (95% CI = 0.3–40.8) for women with M2 genotypes and 1.3 (95% CI = 0.7–2.4) for wild-type genotypes. The corresponding ORs for postmenopausal women were 0.9 (95% CI = 0.2–4.0) and 1.3 (95% CI = 0.8–2.0).

Table 5. Odds ratios (ORs) for breast cancer and smoking in relation to the CYP1A1 genotype at the M2 locus (white women only)

The ORs for smoking and breast cancer according to the M3 genotype in African-American women only are presented in Table 6, due to the low frequency of the M3 allele in white women. Cessation of smoking within 10 years was more strongly associated with breast cancer in women with M3 genotypes than with wild-type genotypes. ORs for smoking for longer than 20 years and for initiation of smoking before age 18 were higher in women with M3 genotypes than with wild-type genotypes. The ORs for the number of cigarettes smoked per day did not differ according to M3 genotype. Results stratified on menopausal status were very imprecise. In premenopausal women, the OR for initiation of smoking before age 18 was 7.3 (95% CI = 0.7–73.0) for women with M3 genotypes and was 1.6 (95% CI = 0.7–3.5) for wild-type genotypes. In postmenopausal women, the OR for smoking for longer than 20 years was 3.0 (95% CI = 0.9–10.7) for women with M3 genotypes and was 1.3 (95% CI = 0.7–2.5) for wild-type genotypes.

Table 6. Odds ratios (ORs) for breast cancer and smoking in relation to the CYP1A1 genotype at the M3 locus (African-American women only)

The ORs for smoking and breast cancer according to the M4 genotype in white women are presented in Table 7, due to the lower frequency of the M4 allele in African-American women. The ORs were imprecise due to the low frequency of the M4 allele, and did not differ appreciably by M4 genotype. A suggestion of an inverse association with long duration of smoking was observed for women with M4 genotypes while a weak positive association was observed for wild-type genotypes.

Table 7. Odds ratios (ORs) for breast cancer and smoking in relation to the CYP1A1 genotype at the M4 locus (white women only)

Joint effects of smoking and the M1 genotype are presented on an additive scale in Table 8. An indication of greater than additive joint effects was observed for M1 genotypes and former smoking, number of cigarettes smoked, duration of smoking, cessation of smoking within 10 years, and initiation of smoking before age 18. The ICRs were highest for early age at initiation of smoking in premenopausal women, and for cessation of smoking within 10 years in both premenopausal and postmenopausal women. However, the interactions among these categories of smoking exposure and CYP1A1 M1 genotype were not statistically significant.

Table 8. Odds ratios (ORs) for breast cancer according to cigarette smoking and the CYP1A1 genotype at the M1 locus, according to menopausal status (African-American and white women combined), additive scale and interaction contrast ratios

ORs were calculated for smoking according to combined genotypes at the M1 and M3 loci in African-American women, and at the M1, M2 and M4 loci in white women. Results were imprecise due to small numbers. In African-American women, the ORs for former smoking were slightly higher for women with any M1 + wt/wt M3 genotypes (OR = 1.9, 95% CI = 0.9–3.8) and wt/wt M1 + any M3 genotypes (OR = 2.0, 95% CI = 0.6–6.2) than for participants with wt/wt M1 + wt/wt M3 genotypes (OR = 1.6, 95% CI = 0.8–3.0). Data were too sparse to calculate ORs among participants with combined any M1 and any M3 genotypes or to evaluate the effects of dose and duration of smoking. Among white women, the ORs for former smoking were elevated among participants with any M1 + wt/wt M2 + wt/wt M4 genotypes (OR = 3.3, 95% CI = 1.3–8.3), but not wt/wt M1 + wt/wt M2 + any M4 genotypes (OR = 1.4, 95% CI = 0.4–5.3) or wt/wt M1 + wt/wt M2 + wt/wt M4 genotypes (OR = 1.0, 95% CI = 0.7–1.5). Data were too sparse to calculate ORs for former smoking in women with other combinations of M1, M2 and M4 genotypes.

Discussion


We conducted a population-based, case–control study of invasive breast cancer in relation to cigarette smoking and CYP1A1 polymorphisms in African-American and white women in North Carolina. ORs for all four CYP1A1 variants (M1, M2, M3 and M4) were close to the null value in each subgroup examined (African-American and white women, premenopausal and postmenopausal women). Two previous studies reported positive associations between CYP1A1 variants and breast cancer. Taioli and colleagues [21] found a moderate to strong association for M1 genotypes among African-American women. A weak positive association for M2 genotypes among Caucasians was reported by Ambrosone and colleagues [18]. Two other studies, by Ishibe and colleagues [22] and by Bailey and colleagues [20], did not find an association between CYP1A1 variants and breast cancer. The ORs for CYP1A1 genotypes in the latter two studies were similar to those of the CBCS (ignoring smoking). The previous studies [18,20-22] were smaller than the CBCS.

Most previous studies of CYP1A1 polymorphisms and breast cancer categorized smoking as ever or never, and did not investigate the effects of dose, duration, age at initiation, time since cessation, or exposure to ETS. The study by Ishibe and colleagues was the only one to evaluate the interaction between former smoking and M1 genotype, and the authors observed no difference in breast cancer risk for former smoking according to the M1 genotype [22]. Our results suggest that the CYP1A1 M1 genotype modifies the association between cigarette smoking and breast cancer risk among former smokers. Joint effects of the M1 genotype and former smoking were most evident among those who quit within 10 years of the reference date. ORs for the duration, age at initiation of smoking, and ETS exposure were also stronger in women with M1 genotypes, especially postmenopausal women. Ambrosone and colleagues [18] showed that the CYP1A1 M1 genotype was associated with an increased risk among lighter smokers, but the analysis was based only upon pack-years and did not address dose, duration, or age at initiation. We observed a stronger association among women with M1 alleles who started smoking at an early age (< 18 years), in contrast to Ishibe and colleagues [22] who found no interaction between M1 genotypes and age at initiation of smoking.

We did not observe strong modification of the ORs for smoking according to M2 genotypes. Ishibe and colleagues [22] found that women with the M2 genotypes showed a stronger association for early age at initiation of smoking compared with women with wild-type genotypes. The study by Bailey and colleagues [20] is the only one to evaluate the effect of the M3 genotype on ORs for smoking and breast cancer. In a study that included African-American women, the authors [20] did not observe modification of the ORs for smoking by M3 genotypes, but smoking was categorized as ever versus never. For African-American women with M3 genotypes, we observed a stronger association for cessation of smoking within 10 years of the reference date, for smoking for longer than 20 years, and for initiation of smoking before age 18. These OR estimates are imprecise, and additional studies of M3 genotype and breast cancer in African-American women are needed.

There are several limitations to our study. The participation rate in controls was low (55%) and could have led to biased estimates of effect. In a previous publication [28], we addressed the potential for selection bias using information from partial interviews conducted on persons who refused to participate in the CBCS. Eligible participants who refused were similar to full participants for most breast cancer risk factors [28], and the prevalence of smoking among controls in the CBCS was similar to previous surveys of the North Carolina population [24]. Participation in the CBCS is unlikely to be related to the CYP1A1 genotype, and therefore any bias in ORs for the CYP1A1 genotype, smoking, or the joint effects of these exposures is likely to be towards the null. Due to financial constraints, we were only able to genotype 80% of phase 1 CBCS cases and 89% of controls. ORs for smoking and other exposures, as well as distributions of these variables, did not differ between study participants with CYP1A1 genotype information and those without (data not shown); the results presented here are therefore likely to be representative of the entire CBCS.

As in previous studies of CYP1A1, RFLP-based laboratory assays do not identify the phase of specific chromosomal combinations of alleles within the CYP1A1 locus (haplotypes). We thus estimated haplotypes using statistical methods, an indirect measurement. Based upon estimated haplotypes, it appears that the effects of the M1 and M3 alleles can be estimated independently in African-American women, since few participants (< 0.01%) appeared to carry the M1 + M3 haplotype. Among white women, some of the effects of M1 may be due to M2, since a small number (4%) of participants carried the M1 + M2 haplotype. Some of the effects of M2 may be due to M4, and vice versa, since the two loci are in close physical proximity. In our study population, the majority of participants with M4 were wild-type at M2, and neither allele was associated with breast cancer risk.

We addressed confounding by most known risk factors for breast cancer, but confounding by unmeasured factors cannot be ruled out. Misclassification of self-reported smoking status is possible. However, it is unlikely that residual confounding or misclassification of smoking would be differential by the CYP1A1 genotype. Information on ETS exposure was limited to exposure at home without measurements of workplace or leisure activity. Failure to fully measure ETS and to remove women with such exposure from the referent group would lead to underestimates (rather than overestimates) of smoking effects.
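The direction of the bias described above can be checked with a short worked calculation. The sketch below uses entirely hypothetical counts and a hypothetical sensitivity value (none of these numbers come from the CBCS); it applies the same exposure misclassification to cases and controls, so that some truly exposed women are counted in the referent group, and compares the resulting odds ratio with the true one.

```python
def odds_ratio(exp_cases, unexp_cases, exp_controls, unexp_controls):
    """Odds ratio for a 2x2 exposure-by-disease table."""
    return (exp_cases * unexp_controls) / (unexp_cases * exp_controls)


def misclassify(exposed, unexposed, sensitivity, specificity):
    """Expected observed counts under imperfect exposure classification."""
    obs_exposed = sensitivity * exposed + (1 - specificity) * unexposed
    obs_unexposed = (1 - sensitivity) * exposed + specificity * unexposed
    return obs_exposed, obs_unexposed


# Hypothetical true counts: (exposed, unexposed).
true_cases = (200, 300)
true_controls = (150, 350)

# Assumption: 20% of truly exposed women are recorded as unexposed,
# with the same error rate in cases and controls (non-differential).
sensitivity, specificity = 0.8, 1.0

obs_cases = misclassify(*true_cases, sensitivity, specificity)
obs_controls = misclassify(*true_controls, sensitivity, specificity)

true_or = odds_ratio(*true_cases, *true_controls)
observed_or = odds_ratio(*obs_cases, *obs_controls)

print(f"true OR     = {true_or:.2f}")      # 1.56
print(f"observed OR = {observed_or:.2f}")  # 1.49
```

Under these assumptions the observed OR lies between 1 and the true OR, which matches the argument that incomplete removal of exposed women from the referent group attenuates, rather than inflates, the estimated effect of smoking.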

An additional limitation to our study is the problem of multiple comparisons. We estimated ORs for many aspects of smoking in premenopausal and postmenopausal women, and some or all of the observed associations could be due to chance. We did not base our analysis upon P values and thus did not adjust for multiple comparisons, but instead compared the magnitude of ORs across categories of smoking exposure and genotype. Our results appear to be consistent with some previous epidemiologic studies as well as current knowledge about the biologic effects of CYP1A1 alleles. Although our study is the largest to date among African-American women, many of the OR estimates were imprecise, and none of the ICRs were statistically significant. Additional studies with a larger sample size as well as data pooling across studies are needed to determine whether some or all of the ORs and interactions in our study and previous studies may be due to chance.
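To make the point about imprecision concrete, the following sketch computes an unadjusted odds ratio with a Woolf (log-based) 95% confidence interval, and the interaction contrast ratio in its usual form, ICR = OR(both exposures) − OR(genotype only) − OR(smoking only) + 1. All counts and ORs are hypothetical, the function names are illustrative, and the study's own estimates came from adjusted models, so this is only a simplified illustration of how sample size drives interval width.

```python
import math


def odds_ratio_ci(exp_cases, unexp_cases, exp_controls, unexp_controls, z=1.96):
    """Unadjusted OR with a Woolf (log-scale) confidence interval."""
    or_ = (exp_cases * unexp_controls) / (unexp_cases * exp_controls)
    se_log = math.sqrt(
        1 / exp_cases + 1 / unexp_cases + 1 / exp_controls + 1 / unexp_controls
    )
    lower = math.exp(math.log(or_) - z * se_log)
    upper = math.exp(math.log(or_) + z * se_log)
    return or_, lower, upper


def interaction_contrast_ratio(or_both, or_genotype_only, or_smoking_only):
    """ICR relative to the doubly unexposed group; 0 means exact additivity."""
    return or_both - or_genotype_only - or_smoking_only + 1


# Two hypothetical tables with the same OR but a tenfold difference in size.
small = odds_ratio_ci(10, 20, 5, 20)
large = odds_ratio_ci(100, 200, 50, 200)

print("small stratum: OR=%.2f, 95%% CI %.2f-%.2f" % small)
print("large stratum: OR=%.2f, 95%% CI %.2f-%.2f" % large)

# Hypothetical joint-effect ORs sharing a common referent group.
icr = interaction_contrast_ratio(3.0, 1.5, 1.2)
print("ICR = %.2f" % icr)
```

In the small hypothetical stratum the interval spans 1 even though the point estimate is 2.0, whereas the larger table with the same OR excludes 1. This is the sense in which stratified estimates from sparse genotype-by-smoking cells can be too imprecise to distinguish a real effect from chance.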

Conclusion

Our results suggest that CYP1A1 M1-containing and M3-containing genotypes increase the risk of breast cancer associated with a long duration (> 20 years) of cigarette smoking, but the effects of the CYP1A1 genotype appear to be quite weak. Additional information on the functional characteristics of CYP1A1 alleles is needed, especially within breast tissue, to address the biologic plausibility of our findings. Since CYP1A1 is involved in activation of polycyclic aromatic hydrocarbons, our results lend support to the hypothesis that polycyclic aromatic hydrocarbon exposure is associated with increased risk of breast cancer. Polycyclic aromatic hydrocarbon–DNA adducts are formed within breast tissue and have been associated with increased breast cancer risk [29].

Future studies of smoking and breast cancer need to address the role of a variety of genetic polymorphisms involved in the metabolism of polycyclic aromatic hydrocarbons, heterocyclic amines and other compounds found in tobacco smoke. Large studies and data pooling will be required to disentangle the complex effects of smoking. Such studies are important since smoking may represent a modifiable risk factor for breast cancer.

Competing interests

None declared.

Authors' contributions

YL and LC conducted the laboratory assays, C-KT conducted the statistical analyses, and YL, RM, DB, BN and KC participated in interpretation of results and writing the manuscript.

Abbreviations

bp = base pairs; CBCS = Carolina Breast Cancer Study; CI = confidence interval; CYP1A1 = cytochrome P4501A1; ETS = environmental tobacco smoke; ICR = interaction contrast ratio; OR = odds ratio; PCR = polymerase chain reaction; RFLP = restriction fragment length polymorphism.

Acknowledgements

The authors wish to thank the nurse-interviewers for the CBCS for their important contributions and the Molecular Epidemiology (K Conway, Director) and Tissue Procurement Core Laboratories (L Dressler, Director) of the Lineberger Cancer Research Center at the University of North Carolina for their technical support. This research was funded in part by the Specialized Program of Research Excellence (SPORE) in Breast Cancer (NIH/NCI P50-CA58223), by Pesticides and Breast Cancer in North Carolina (NIH/NIEHS R01-ES07128), by the Center for Environmental Health and Susceptibility (NIEHS P30-ES10126), and by the Superfund Basic Research Program (NIEHS P42-ES05948).

References

  1. Palmer J, Rosenberg L: Cigarette smoking and the risk of breast cancer.

    Epidemiol Rev 1993, 15:145-156.

  2. Terry P, Rohan T: Cigarette smoking and the risk of breast cancer in women: a review of the literature.

    Cancer Epidemiol Biomarkers Prev 2002, 11:953-971.

  3. Terry P, Miller A, Rohan T: Cigarette smoking and breast cancer risk: a long latency period?

    Int J Cancer 2002, 100:723-728.

  4. Bartsch H, Nair U, Risch A, Rojas M, Wikman H, Alexandrov K: Genetic polymorphism of CYP genes, alone or in combination, as a risk modifier of tobacco-related cancers.

    Cancer Epidemiol Biomarkers Prev 2000, 9:3-28.

  5. Law M: Genetic predisposition to lung cancer.

    Br J Cancer 1990, 61:195-206.

  6. Pyykko K, Tuimala R: Is aryl hydrocarbon hydroxylase activity a new prognostic factor of breast cancer?

    Br J Cancer 1991, 63:596-600.

  7. McKay J, Murray G, Ah-See A, Greenlee W, Craig B, Burke M, Melvin W: Differential expression of CYP1A1 and CYP1B1 in human breast cancer [abstract].

    Biochem Soc Transact 1996, 24:327s.

  8. Williams J, Stone E, Millar B, Gusterson B, Grover P, Phillips D: Determination of the enzymes responsible for activation of the heterocyclic amine 2-amino-3-methylimidazo[4,5-f]quinoline in the human breast.

    Pharmacogenetics 1999, 8:519-528.

  9. Kawajiri K, Nakachi K, Imai K, Shinoda N, Watanabe J: Identification of genetically high risk individuals to lung cancer by DNA polymorphisms of the cytochrome P4501A1 gene.

    FEBS Lett 1990, 263:131-133.

  10. Hayashi S, Watanabe J, Nakachi K, Kawajiri K: Genetic linkage of lung cancer-associated MSP I polymorphism with amino acid replacement in the heme binding region of the human cytochrome P4501A1 gene.

    J Biochem 1991, 110:407-411.

  11. Crofts F, Cosma G, Currie D, Taioli E, Toniolo P, Garte S: A novel CYP1A1 gene polymorphism in African-Americans.

    Carcinogenesis 1993, 14:1729-1731.

  12. Cascorbi I, Brockmoller J, Roots I: A C4887A polymorphism in exon 7 of human CYP1A1: population frequency, mutation linkages, and impact on lung cancer susceptibility.

    Cancer Res 1996, 56:4965-4969.

  13. Cosma G, Crofts F, Taioli E, Toniolo P, Garte S: Relationship between genotype and function of the human CYP1A1 gene.

    J Toxicol Environ Health 1993, 40:309-316.

  14. Kiyohara C, Hirohata T, Inutsuka S: The relationship between aryl hydrocarbon hydroxylase and polymorphisms of the CYP1A1 gene.

    Jpn J Cancer Res 1996, 87:18-24.

  15. Crofts F, Taioli E, Trachman J, Cosma G, Currie D, Toniolo P, Garte S: Functional significance of different human CYP1A1 genotypes.

    Carcinogenesis 1994, 15:2961-2963.

  16. Petersen D, McKinney C, Ikeya K, Smith H, Bale A, McBride O, Nebert D: Human CYP1A1 gene: cosegregation of the enzyme inducibility phenotype and an RFLP.

    Am J Hum Genet 1991, 48:720-725.

  17. Landi M, Bertazzi T, Shields P, Clark P, Lucier G, Garte S, Cosma G, Caporaso N: Association between CYP1A1 genotype, mRNA expression and enzymatic activity in humans.

    Pharmacogenetics 1994, 4:242-246.

  18. Ambrosone C, Freudenheim J, Graham S, Marshall J, Vena J, Brasure J, Laughlin R, Nemoto T, Michalek A, Harrington A, Ford T, Shields P: Cytochrome P450 1A1 and glutathione S-transferase (M1) genetic polymorphisms and postmenopausal breast cancer risk.

    Cancer Res 1995, 55:3483-3485.

  19. Rebbeck T, Rosvold E, Duggan D, Zhang J, Buetow K: Genetics of CYP1A1: coamplification of specific alleles by polymerase chain reaction and association with breast cancer.

    Cancer Epidemiol Biomarkers Prev 1994, 3:511-514.

  20. Bailey L, Roodi N, Verrier C, Yee C, Dupont W, Parl F: Breast cancer and CYP1A1, GSTM1, and GSTT1 polymorphism: evidence of a lack of association in Caucasians and African-Americans.

    Cancer Res 1998, 58:65-70.

  21. Taioli E, Trachman J, Chen X, Toniolo P, Garte S: A CYP1A1 restriction fragment length polymorphism is associated with breast cancer in African-American women.

    Cancer Res 1995, 55:3757-3758.

  22. Ishibe N, Hankinson S, Colditz G, Spiegelman D, Willett W, Speizer F, Kelsey K, Hunter D: Cigarette smoking, cytochrome P4501A1 polymorphisms and breast cancer risk in the Nurses' Health Study.

    Cancer Res 1998, 58:667-671.

  23. Newman B, Moorman P, Millikan R, Qaqish B, Geradts J, Aldrich T, Liu E: The Carolina Breast Cancer Study: integrating population-based epidemiology and molecular biology.

    Breast Cancer Res Treat 1995, 35:51-60.

  24. Millikan R, Pittman G, Newman B, Tse C-K, Selmin O, Rockhill B, Savitz D, Moorman P, Bell D: Cigarette smoking, N-acetyltransferases 1 and 2, and breast cancer risk.

    Cancer Epidemiol Biomarkers Prev 1998, 7:371-378.

  25. Bartsch H, Nair U, Risch A, Rojas M, Wikman H, Alexandrov K: Genetic polymorphism of CYP genes, alone or in combination, as a risk modifier of tobacco-related cancers.

    Cancer Epidemiol Biomarkers Prev 2000, 9:3-28.

  26. Zhao J, Curtis D, Sham P: Model-free analysis and permutation tests for allelic association.

    Hum Hered 2000, 50:133-139.

  27. Hosmer D, Lemeshow S: Confidence interval estimation of interaction.

    Epidemiology 1992, 3:452-456.

  28. Moorman P, Newman B, Millikan R, Tse C-K, Sandler D: Participation rates in a case-control study: the impact of age, race, and race of interviewer.

    Ann Epidemiol 1999, 9:188-195.

  29. Rundle A, Tang D, Hibshoosh H, Schnabel F, Kelly A, Levine R, Zhou J, Link B, Perera F: Molecular epidemiology studies of polycyclic aromatic hydrocarbon-DNA adducts and breast cancer.

    Environ Mol Mutagen 2002, 39:201-207.