Marilyn Johnson v. City of Memphis, 2014 FED App. 0271P (2014)


    RECOMMENDED FOR FULL-TEXT PUBLICATION
    Pursuant to Sixth Circuit I.O.P. 32.1(b)
    File Name: 14a0271p.06
    UNITED STATES COURT OF APPEALS
    FOR THE SIXTH CIRCUIT
    _________________
    MARILYN JOHNSON, et al.,
    Plaintiffs-Appellants/Cross-Appellees,
    v.
    CITY OF MEMPHIS,
    Defendant-Appellee/Cross-Appellant.
    Nos. 13-5452/5454
    Appeal from the United States District Court
    for the Western District of Tennessee at Memphis.
    Nos. 2:00-cv-02608; 2:04-cv-02013; 2:04-cv-02017—S. Thomas Anderson, District Judge.
    Argued: January 30, 2014
    Decided and Filed: October 27, 2014
    Before: SUHRHEINRICH, GIBBONS, and COOK, Circuit Judges
    _________________
    COUNSEL
    ARGUED: David M. Sullivan, Memphis, Tennessee, for Appellants/Cross-Appellees. Louis P.
    Britt, FORD & HARRISON LLP, Memphis, Tennessee, for Appellee/Cross-Appellant. ON
    BRIEF: David M. Sullivan, Memphis, Tennessee, for Appellants/Cross-Appellees. Louis P.
    Britt, J. Dylan King, Joshua J. Sudbury, FORD & HARRISON LLP, Memphis, Tennessee, for
    Appellee/Cross-Appellant.
    _________________
    OPINION
    _________________
    COOK, Circuit Judge. After more than thirteen years of litigation, including a bench
    trial, numerous preliminary injunctions, and a previous appeal affirming the grant of injunctive
    relief for some plaintiffs, see Johnson v. City of Memphis (“Johnson Appeal I”), 444 F. App’x
    Nos. 13-5452/5454            Johnson, et al. v. City of Memphis                                    Page 2
    856, 861 (6th Cir. 2011), three consolidated cases challenging the City of Memphis’s (“City”)
    police promotional processes as racially discriminatory return on cross-appeals. The appeals
    address two allegedly discriminatory sergeant promotional processes that occurred in 2000 and
    2002 (the “2000 process” and “2002 process”1), targeting three matters decided by the district
    court at different phases of the litigation: (1) the order dismissing plaintiffs’ negligence claim
    concerning the already-invalidated 2000 process under Tennessee’s governmental-immunity
    statute, Tenn. Code Ann. § 29-20-205; (2) the bench-trial decision invalidating the 2002 process
    for violating Title VII’s disparate-impact prohibition, see 42 U.S.C. § 2000e-2(k)(1); and (3) the
    final judgment and related orders awarding back pay and interest to plaintiffs and more than
    $1 million in fees and expenses to their attorneys. Both the plaintiffs and the City appeal various
    aspects of these decisions.
    For the following reasons, we affirm in part and reverse in part the district court’s
    judgment, and we remand the fees issues for further consideration.
    I. BACKGROUND
    We briefly summarize the factual background of these cases thoroughly detailed in the
    district court’s bench-trial opinion. The City’s promotional processes have engendered
    controversy for nearly forty years, prompting numerous lawsuits alleging racial and gender
    discrimination by such parties as the United States Department of Justice, the Afro-American
    Police Association, and white and minority officers. See Aiken v. City of Memphis, 37 F.3d 1155,
    1158–60 (6th Cir. 1994) (en banc) (detailing the extensive litigation history). Despite the
    City’s repeated assurances of adopting race-neutral promotional processes, we observed that, as
    of the mid-1990s, “incredibly, the City continue[d] to make police and fire department
    promotions according to procedures that have not been validated as racially neutral.” Id. at 1164.
    The City responded with a 1996 promotional process (“1996 process”) designed by Dr.
    Mark Jones, an industrial and organizational psychologist, and overseen by a Department of
    Justice consultant. The 1996 process consisted of four components, weighted as follows: a
    “high-fidelity” law enforcement role-play exercise, 50%; written test, 20%; performance
    evaluations, 20%; and seniority, 10%. Arbitration proceedings involving claims under the City’s
    Memorandum of Understanding with the police union ensued, but no Title VII litigation resulted.
    1
    We refer to the second promotion period as the “2002 process,” even though the City administered the test
    in September 2001, for consistency with the parties’ arguments and our previous decision.
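The four-component weighting of the 1996 process can be read as a simple weighted sum. The sketch below is illustrative only: it assumes, hypothetically, that every component is scored on a common 0 to 100 scale, which the opinion does not state. The names `WEIGHTS_1996` and `composite_score` and the sample candidate are invented for the example and do not come from the record.

```python
# Component weights for the 1996 process, as listed in the opinion.
WEIGHTS_1996 = {
    "role_play": 0.50,
    "written_test": 0.20,
    "performance_evaluations": 0.20,
    "seniority": 0.10,
}


def composite_score(component_scores: dict[str, float],
                    weights: dict[str, float]) -> float:
    """Weighted sum of component scores.

    Assumption (not from the opinion): each component is on a 0-100 scale.
    """
    if set(component_scores) != set(weights):
        raise ValueError("scores and weights must cover the same components")
    return sum(weights[name] * component_scores[name] for name in weights)


# Hypothetical candidate; these numbers are invented for illustration.
example = {
    "role_play": 80.0,
    "written_test": 70.0,
    "performance_evaluations": 90.0,
    "seniority": 60.0,
}
print(composite_score(example, WEIGHTS_1996))
```

Under this reading, a change in the role-play score moves the composite five times as much as the same change in the seniority score, which is why removing or replacing that component reshapes the whole process.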
    Dr. Jones modeled the City’s next promotion protocol after the 1996 process, replacing
    the role-play component with a video-based practical test because of security and practicability
    concerns. The 1996 simulation had taken more than two months (testing and scoring) to evaluate
    individually more than 400 candidates, and the City discovered problems with candidate
    coaching during the exercise. The following components initially comprised the 2000 process: a
    “low-fidelity” (i.e., no role-play) video-based practical test, 50%; job knowledge test, 20%;
    performance evaluations, 20%; seniority, 10%. After the City discovered that leaked answers
    compromised the results of the video test, the City excluded the video test and reweighted the
    remaining test components. The adjustments to the 2000 process prompted the first of these
    disparate-impact cases, Johnson v. City of Memphis, No. 00–2608, and the City ultimately
    consented to the invalidation of the 2000 process by Judge Jon McCalla in June 2001. (See R.
    58, Order at 1–2.2)
    Attempting to avoid the test-security issues encountered in the previous two promotional
    periods, the City hired outside consultants Jeanneret & Associates to design the replacement tests
    that would become the 2002 process. After the City submitted a testing proposal to the district
    court, Judge McCalla held a status conference to hear plaintiffs’ objections and instructed
    plaintiffs’ expert to work with the City’s expert, Dr. Richard Jeanneret. The City addressed the
    concerns raised by plaintiffs’ expert, and the district court granted the City’s motion to proceed
    with the 2002 process. The 2002 process included the following equally weighted test
    components: an investigative logic test; a job-knowledge test; an application-of-knowledge test;
    a grammar and clarity test; and a “low-fidelity” video-based practical test.
    The City administered the 2002 process to 517 applicants between September 27–29,
    2001, and completed grading in fall 2002. Raw scores ranged from 174.75–358.75 out of a
    possible 384.5 points. The City converted these scores to a 100-point scale and then—honoring
    an agreement with the officers’ union—added up to 10 points for seniority to the final promotion
    score. Promotion scores ranged from 53.511–103.303, of a possible 110 points. Despite the
    City’s efforts, the 2002 process resulted in minority candidates scoring disproportionately worse
    than white candidates. Using Dr. Jeanneret’s rank-ordered promotion scores, the City promoted
    86 of the 274 African-American candidates (31.4%) and 176 of the 240 white candidates
    (73.3%). The original plaintiffs amended their pleadings to challenge the disparate impact of the
    2002 process, and two additional lawsuits (Johnson v. City of Memphis, No. 04–2017, and
    Billingsley v. City of Memphis, No. 04–2013) joined the consolidated proceedings, which had
    been reassigned to then-District Judge Bernice Donald in September 2001.
    2
    All record citations refer to case No. 00-2608.
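The scoring and promotion figures in this paragraph can be checked with a few lines of arithmetic. The sketch below assumes a linear rescaling of raw scores to the 100-point scale; the opinion does not state the conversion formula, although the highest reported promotion score is consistent with a linear rescale plus the full 10 seniority points. The function names and the ratio of the two rates are illustrative and not part of the court's analysis.

```python
from fractions import Fraction

MAX_RAW = 384.5        # maximum possible raw score, per the opinion
MAX_SENIORITY = 10.0   # cap on seniority points, per the opinion


def promotion_score(raw: float, seniority_points: float) -> float:
    """Rescale a raw score to 100 points and add seniority points.

    Assumption (not from the opinion): the rescaling is linear.
    """
    if not 0 <= raw <= MAX_RAW:
        raise ValueError("raw score out of range")
    if not 0 <= seniority_points <= MAX_SENIORITY:
        raise ValueError("seniority points out of range")
    return raw / MAX_RAW * 100 + seniority_points


def selection_rate(promoted: int, candidates: int) -> Fraction:
    """Share of a group's candidates who were promoted."""
    if candidates <= 0:
        raise ValueError("candidates must be positive")
    return Fraction(promoted, candidates)


# Promotion counts as stated in this paragraph of the opinion.
african_american = selection_rate(86, 274)
white = selection_rate(176, 240)

# Ratio of the lower rate to the higher rate (illustrative only).
impact_ratio = african_american / white

print(f"Top raw score with full seniority: {promotion_score(358.75, 10.0):.3f}")
print(f"African-American selection rate: {float(african_american):.1%}")
print(f"White selection rate: {float(white):.1%}")
print(f"Ratio of rates: {float(impact_ratio):.3f}")
```

The two selection rates reproduce the percentages the court reports, and the ratio shows that African-American candidates were promoted at well under half the rate of white candidates.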
    The district court held a bench trial in July 2005 and issued its decision in December
    2006. Its Memorandum Opinion and Order on Remedies rejected all claims except plaintiffs’
    Title VII disparate-impact claims as to the 2002 process. The court found that, while the 2002
    sergeant test was valid and reliable, less discriminatory valid alternatives were available and,
    thus, the 2002 process violated Title VII. Though the court ordered the promotion of all minority
    plaintiffs, with back pay and seniority, it denied plaintiffs’ request, at that time, to compete for
    promotion to the rank of lieutenant because they lacked the requisite two years’ experience as
    sergeant. See Johnson Appeal I, 444 F. App’x at 857 (detailing district court’s
    procedural history).
    Following the bench-trial decision, the district court fielded a variety of remedies-related
    motions for injunctions and stays between 2007 and 2010. Because so much time had passed
    since the problematic 2000 and 2002 processes, plaintiffs’ alleged injuries, in terms of lost pay
    and seniority, spilled over into subsequent promotional processes, as plaintiffs were denied the
    opportunity to apply for additional promotions. At different points, court orders relying on the
    Title VII judgment invalidating the 2002 process permitted plaintiffs to participate in those
    promotions, see generally Johnson Appeal I, 444 F. App’x at 857 (lieutenant promotions), but
    the district court repeatedly denied plaintiffs’ request for additional retroactive seniority and
    back pay.
    In March 2010, the court entered a preliminary injunction ordering the immediate
    promotion to the rank of lieutenant of 28 plaintiffs with passing exam scores and sufficient work
    experience, and we affirmed in Johnson Appeal I, 444 F. App’x at 857–58, 861. In affirming the
    preliminary injunction, the panel expressed “concern[] at the degree of delay” of “this case, now
    in its eleventh year,” and admonished that it would entertain a mandamus petition if the district
    court failed to enter a final judgment within the next six months. Id. at 861 (noting that the
    district court’s 2006 bench-trial decision “remains interlocutory almost five years later”). After
    plaintiffs petitioned for mandamus in January 2013, the district court awarded back pay, interest,
    and attorneys’ fees and entered a final judgment, whereupon plaintiffs voluntarily dismissed their
    mandamus action.
    The plaintiffs appeal the immunity-based denial of their negligence claim related to the
    2000 process and various remedies and attorneys’ fees issues related to the 2000 and 2002
    processes; the City cross-appeals the district court’s Title VII judgment invalidating the 2002
    process and the related million-dollar attorneys’ fees award; and the plaintiffs present an
    alternative legal justification3 for the Title VII judgment against the 2002 process.
    II. JOHNSON I PLAINTIFFS’ APPEAL: NEGLIGENCE CLAIM, 2000 PROCESS
    First, the non-minority Johnson I plaintiffs dispute the application of governmental
    immunity to their negligence claim, targeting the already-invalidated 2000 process. They press
    this claim—their only one seeking damages—arguing that the decisionmakers responsible for the
    2000 process committed non-discretionary acts ineligible for immunity. We review the district
    court’s grant of summary judgment de novo. Ciminillo v. Streicher, 434 F.3d 461, 464 (6th Cir.
    2006).
    According to the Johnson I plaintiffs, City officials violated a key provision of the City
    Charter requiring the use of “practical tests” in the promotion process. Specifically, they object
    to the City’s exclusion of the interactive, video-based component of the 2000 process upon
    discovering that some candidates received advance notice of the questions.
    The district court rejected this argument, finding that “the decisions concerning what type
    of test to use, how to weight the various testing components, and how the tests are to be
    administered are left to the discretion of the director of personnel,” and noting that the Charter’s
    practical-test requirement “must be interpreted by those in a position to make such decisions for
    [the City].” We agree with the district court.
    3
    Though styled a “conditional cross-appeal” in plaintiffs’ response brief, we construe the argument as an
    alternative legal justification for the district court’s judgment. See ASARCO, Inc. v. Sec’y of Labor, 206 F.3d 720,
    722 (6th Cir. 2000) (“It is a well settled principle that a prevailing party cannot appeal an unfavorable aspect of a
    decision in its favor.”); see also Freeze v. City of Decherd, 753 F.3d 661, 664 (6th Cir. 2014) (“Appellate courts
    reviewing grants of summary judgment may affirm on any grounds supported by the record.”); Abel v. Dubberly,
    210 F.3d 1334, 1338 (11th Cir. 2000) (applying similar standard to post-trial motions for judgment as a matter of
    law, considering preserved alternative legal arguments).
    Tennessee’s Governmental Tort Liability Act (GTLA) immunizes the state’s public
    officials from negligence suits where “the injury arises out of . . . [t]he exercise or performance
    . . . of a discretionary function, whether or not the discretion is abused.” Tenn. Code Ann. § 29-
    20-205(1). Tennessee courts measure the scope of this immunity with the “planning-operational
    test.” Giggers v. Memphis Hous. Auth., 363 S.W.3d 500, 507 (Tenn. 2012). Because arguably
    “every act involves discretion,” courts must “examin[e] (1) the decision-making process and
    (2) the propriety of judicial review of the resulting decision.” Bowers v. City of Chattanooga,
    826 S.W.2d 427, 431 (Tenn. 1992). Whereas discretionary “planning decision[s] usually
    involve[] consideration and debate regarding a particular course of action by those charged with
    formulating plans or policies,” non-discretionary “[o]perational decisions . . . implement
    preexisting laws, regulations, policies, or standards” and “do[] not involve the formulation of
    new policy.” Giggers, 363 S.W.3d at 507–08. Accordingly, we must determine whether the
    City Charter and ordinance prescribe sufficient instructions such that the formulation and
    modification of the 2000 process can be deemed operational, as opposed to discretionary.
    Contrary to the Johnson I plaintiffs’ suggestion, the City Charter and related ordinance do
    not require “practical tests.” Rather, they provide that employment examinations “shall be of a
    practical nature and relate to such matters as will fairly test the relative competency of the
    applicant to discharge the duties of the particular position.” (R. 656-25, City Charter § 250.1
    (emphasis added); accord R. 656-26, Civil Service Ordinance § 9-3.) This subtle difference
    suggests that the regulations provide a broad instruction that examinations test actual job
    functions, instead of a strict requirement for a specific type of interactive exercise, like a
    simulation or video-based test. Other aspects of the Charter provision similarly support treating
    test-design as a discretionary function. (See R. 656-25, City Charter § 250.1 (requiring
    “competitive job-related examinations under such rules and regulations as may be adopted by the
    Director of Personnel,” and providing that the exams “should be developed in conjunction with
    other tools of personnel assessment and . . . sound programs of job design to aid significantly in
    the development and maintenance of an efficient work force and in the utilization and
    conservation of human resources”).) Plaintiffs offer no authority supporting their narrow
    interpretation. Nor do they explain how the Charter and ordinance preclude the City from taking
    the sensible step of voiding a compromised component of its employment examination.
    The district court correctly recognized that City officials must interpret and implement
    the Charter’s broad guidance in devising fair and effective promotional processes. In the
    absence of specific regulations confining the City’s discretion, GTLA immunity shields this
    discretionary decision. See Giggers, 363 S.W.3d at 507–08. We therefore AFFIRM the district
    court’s grant of partial summary judgment to the City on this claim.
    III. CITY’S CROSS-APPEAL: TITLE VII JUDGMENT, 2002 PROCESS
    Next, the City cross-appeals the district court’s bench-trial ruling finding a Title VII
    disparate-impact violation. The parties agree that plaintiffs presented a prima facie case of the
    2002 process’s disparate impact; the City promoted 264 of the 517 candidates, with a substantial
    disparity between the success rate of non-minority (175/240) and African-American candidates
    (86/274). The City argues, however, that the court applied an unduly deferential legal standard
    in finding that plaintiffs showed less discriminatory alternatives to the 2002 process.
    We review the court’s legal conclusions de novo and findings of fact for clear error. E.g.,
    Beaven v. U.S. Dep’t of Justice, 622 F.3d 540, 547 (6th Cir. 2010).
    A. The Title VII Disparate-Impact Standard
    Though Title VII disparate-impact claims originated with the Supreme Court’s decision
    in Griggs v. Duke Power Co., 401 U.S. 424 (1971), Congress codified the disparate-impact
    standard in the Civil Rights Act of 1991. See 42 U.S.C. § 2000e-2(k)(1); Ricci v. DeStefano,
    557 U.S. 557, 577–78 (2009). Courts assess the viability of these claims using a three-step
    burden-shifting framework akin to the familiar McDonnell-Douglas standard. See 42 U.S.C.
    § 2000e-2(k)(1)(A)–(k)(1)(C); Black Law Enforcement Officers Ass’n v. City of Akron,
    824 F.2d 475, 480 (6th Cir. 1987).
    [First,] a plaintiff establishes a prima facie violation by showing that an employer
    uses “a particular employment practice that causes a disparate impact on the basis
    of race, color, religion, sex, or national origin.” 42 U.S.C. § 2000e-2(k)(1)(A)(i).
    [Second, the] employer may defend against liability by demonstrating that the
    practice is “job related for the position in question and consistent with business
    necessity.” Ibid. [Third,] . . . if the employer meets that burden, . . . [the] plaintiff
    may still succeed by showing that the employer refuses to adopt an available
    alternative employment practice that has less disparate impact and serves the
    employer’s legitimate needs. §§ 2000e-2(k)(1)(A)(ii) and (C).
    Ricci, 557 U.S. at 578; see also Davis v. Cintas Corp., 717 F.3d 476, 494–95 (6th Cir. 2013).
    The City contests plaintiffs’ step-three showing of less discriminatory alternatives. To
    satisfy this element, the plaintiff must demonstrate: (1) the availability of alternative procedures
    that serve the employer’s legitimate interests and (2) produce “substantially equally valid”
    results, but with (3) less discriminatory outcomes. 29 C.F.R. § 1607.3(B); see also Watson v.
    Fort Worth Bank & Trust, 487 U.S. 977, 998 (1988); Shollenbarger v. Planes Moving &
    Storage, 297 F. App’x 483, 486–87 (6th Cir. 2008). As with Title VII claims of intentional
    discrimination, disparate-impact plaintiffs bear the burdens of production and persuasion at this
    step. 42 U.S.C. §§ 2000e(m), 2000e-2(k)(1)(A)(i)–(ii). Consequently, plaintiffs may not rest on
    speculation regarding the availability, validity, or less discriminatory nature of their proffered
    alternatives. See, e.g., Allen v. City of Chicago, 351 F.3d 306, 313, 316–17 (7th Cir. 2003)
    (deeming insufficient “vague or fluctuating” alternatives, and finding that the plaintiffs failed to
    substantiate their “bare assertion” of valid, less discriminatory alternatives); Shollenbarger,
    297 F. App’x at 487 (emphasizing that “[t]he plaintiffs [a]re obligated to prove equally effective
    alternatives,” and that “[t]he purpose of [step three] is not to second guess the employer’s
    business decisions”).
    B. Components of the 2002 Process & Plaintiffs’ Proposed Alternatives
    As noted above, the 2002 process consisted of five testing components: (1) a “low-
    fidelity” video test, which required oral responses to video depictions of law enforcement
    scenarios; (2) an investigative logic test, consisting of multiple-choice and short-answer
    questions; (3) an open-book job-knowledge test; (4) an application test, with weighted scores
    differentiating between the most and least effective responses; and (5) a written communications
    exam testing for grammar and clarity.
    As they did before the district court, plaintiffs assert three available alternatives to
    improve the 2002 process: (1) the 1996 process’s high-fidelity role-playing exercise, which
    required candidates to respond to simulated law-enforcement scenarios (“1996 simulation”);
    (2) assessments of candidates’ “integrity” and “conscientiousness”; and (3) a merit-promotion
    system similar to one used by the Chicago Police Department, which consists of interviews by
    merit-review boards. Yet, in arguing before this court for these alternatives, they shirk their duty
    to demonstrate the benefits of the Chicago-plan and integrity/conscientiousness theories,
    defending only the 1996 simulation as equally valid and less discriminatory. (Third Br. at 31–
    37.) Similarly problematic, plaintiffs neglect to explain how any of these alternatives would fit
    into the 2002 process, but we gather that they would either replace or complement its existing
    components.
    Plaintiffs vouch for the 1996 simulation by pointing to its past success, including a
    sterling validation report documenting its non-discriminatory results. They also tout its benefits
    compared to the less practical (i.e., less like actual job duties), low-fidelity video test used in the
    2002 process. Finally, they rely on their expert’s claim that the 1996 simulation is more valid
    than the 2002 tests and “easily replicated.” (See Third Br. at 32–35; R. 648-13, Trial Tr.
    (DeShon) at 1681–82; see also R. 648-15, Trial Tr. (DeShon) at 1848 (likening the difference
    between high-fidelity simulations and low-fidelity response exercises to “knowing versus
    doing”).)
    C. The District Court’s Bench-Trial Findings Regarding Available Alternatives
    After summarizing the proffered alternatives, which the court characterized as “broad
    suggestions [of] alternative testing modalities,” the court found that plaintiffs satisfied the
    step-three burden of demonstrating available, equally valid, less discriminatory alternatives. It
    reasoned as follows:
    It is of considerable significance that the City had achieved a successful
    promotional program in 1996 and yet failed to build upon that success. While the
    1996 process was not perfect it appears to have satisfied all of the legal
    requirements of promotional processes. The 2000 process departed substantially
    from the 1996 model in its abandonment of the practical exercise and re-
    weighting of the remaining elements. The 2002 processes, while arguably more
    sophisticated than its predecessors, suffered from a grossly disproportionate
    impact on minority candidates.
    It is unnecessary for the Court to scrutinize the advisability of
    incorporating assessments of qualities such as integrity and conscientiousness or
    the relative merits of the Chicago process. It is sufficient to acknowledge that the
    existence of such alternative measures and methods belies, as Plaintiffs suggest,
    Defendants’ position that they had no choice but to go forward with the 2002
    promotion process despite its adverse impact because no alternative methods with
    less adverse impact were available.
    Defendant argues that Plaintiffs have failed to meet their burden because
    none of the alternatives now suggested were proposed at the time the 2002
    process was implemented. This argument misconstrues the appropriate standard.
    Plaintiffs must prove that there was “another available method of evaluation
    which was equally valid and less discriminatory.” Bryant v. City of Chicago,
    200 F.3d 1092, 1094 (7th Cir. 2000) (emphasis added). Plaintiffs are not required
    to have proposed the alternative. The requirement is only that the alternative was
    available. The Court reads “availability” in this context to mean that Defendant
    either knew or should have known that such an alternative existed. Plaintiffs have
    amply demonstrated that Defendant knew of all three alternatives they have
    set forth.
    (R. 388, Bench Trial Op. at 25–26.)
    Notably, the court relies on the relative success of the 1996 test, without (1) requiring
    evidence that the 2002 process would benefit from incorporating the 1996 test’s simulation, or
    (2) addressing the City’s interest in test-security, in light of the 1996 simulation’s documented
    cheating. Also, the district court expressly declines to consider the merits of the
    integrity/conscientiousness and Chicago-plan alternatives, resting its conclusion solely on the
    City’s denial of alternatives.
    D. The City’s Challenge to the Court’s Analysis
    The City challenges the district court’s judgment, asserting both legal error
    and factual deficiencies with plaintiffs’ step-three showing. Though plaintiffs characterize the
    City’s argument as an attack on the district court’s factual findings, invoking the deference
    of clear-error review, the district court’s analysis contains legal errors subject to our de novo
    review. Beaven, 622 F.3d at 547.
    First, the district court readily admits crediting the Chicago-plan and
    integrity/conscientiousness alternatives without considering their relative merit; this approach
    conflicts with Title VII’s requirement that plaintiffs prove the availability of equally valid, less
    discriminatory measures. See 42 U.S.C. §§ 2000e(m), 2000e-2(k)(1)(A)(i)–(ii); 29 C.F.R.
    § 1607.3(B); Allen, 351 F.3d at 316–17; Shollenbarger, 297 F. App’x at 487.
    Second, the district court accords “considerable significance” to the results of the 1996
    simulation with no discussion of the City’s test-security concerns. Courts recognize employers’
    legitimate interest in preserving the integrity of their employment processes. E.g., Hearn v. City
    of Jackson, 340 F. Supp. 2d 728, 742 (S.D. Miss. 2003) (overruling disparate-impact plaintiffs’
    proposal requiring all applicants to complete a lengthy, interview-based selection procedure,
    noting the city’s legitimate interests in resource preservation, avoiding the appearance of
    selection bias, and preventing later applicants from obtaining the questions in advance), aff’d,
    110 F. App’x 424 (5th Cir. 2004) (per curiam).
    Here, the City presented undisputed evidence that leaked information and candidate
    coaching compromised both the 1996 simulation and its 2000-process replacement, a video-
    based test of law enforcement techniques. (R. 648-6, Trial Tr. (Jones) at 863–65 (discussing the
    “coaching” problems experienced with the 1996 simulation); R. 648-16, Trial Tr. (Claxton) at
    2003 (explaining that City employees were excluded from the creation of the 2002 process,
    because “city employees are accused of funneling questions and/or answers to participants in a
    prior process”).) Though candidate coaching did not affect the outcome of the 1996
    simulation—evaluators helped poor-performing candidates who would not qualify for
    promotion—it exposed a security flaw, and the 1996 process’s designer testified that the
    simulator “was [the] weakest link” of the process, noting that “it contributed to most of the race
    differences” arising from the 1996 process’s testing methodologies. (R. 648-7, Trial Tr. (Jones)
    at 921–22.) The parties certainly knew of these security problems during the development of the
    2002 process, as evidenced by Judge McCalla’s statements at the parties’ June 27, 2001 status
    conference. (See, e.g., R. 656-17, 6/27/01 Hr’g Tr. at 42 (“[T]he issues that arose in the previous
    test, we don’t want to run the chance of affecting the outcome of the test by giving out
    unnecessary information . . . .”).)
    Third, the district court’s analysis elides the City’s concern regarding the impracticability
    of the 1996 simulation, which required numerous actors to portray the two-hour law enforcement
    scenarios and took nearly three months to evaluate more than 400 applicants. (See R. 648-6,
    Trial Tr. (Jones) at 863–66.) As the City’s expert explained, the protracted nature of simulation
    testing and the number of moving parts reinforced the City’s concerns about testing security.
    (Id.; see also R. 648-11, Trial Tr. (Jeanneret) at 1461 (citing “all of the issues that had been
    raised about the [City’s testing] and the confidentiality and . . . prior knowledge of the test
    and . . . the integrity of the process” as reasons he declined to use the 1996 process).) The court
    should have accounted for the City’s legitimate interests in test security and practicability in
    assessing plaintiffs’ proffered alternatives. See Watson, 487 U.S. at 998 (plurality) (“Factors
    such as the cost or other burdens of proposed alternative selection devices are relevant in
    determining whether they would be equally as effective as the challenged practice in serving the
    employer’s legitimate business goals.”); see also Allen, 351 F.3d at 314–15 (considering
    proposal’s effect on the city-employer’s financial interests); Clady v. Cnty. of Los Angeles,
    770 F.2d 1421, 1432 (9th Cir. 1985) (“Financial concerns are legitimate needs of the employer.”);
    Chrisner v. Complete Auto Transit, Inc., 645 F.2d 1251, 1263 (6th Cir. 1981) (“Of course, the
    marginal cost of another hiring policy and its implications for public safety are factors which
    should not be omitted from consideration.”).
    Finally, the Seventh Circuit’s decision in Allen persuades us that the district court erred
    by relying solely on the past success of the 1996 process in determining that the 2002 process
    should have incorporated a live simulation. Allen similarly involved police officers’ challenge to
    a city’s promotion process. The officers proposed eliminating the written job-skills test from the
    process, so as to give full weight to merit-review boards. See Allen, 351 F.3d at 316–17. Noting
    the absence of “evidence that merit selection is inherently less likely to cause a disparate impact”
    than the other testing procedures, the court rejected this proposal and affirmed the grant of
    summary judgment to the city, explaining that “[t]he non-discriminatory history of past merit
    selection in the [Chicago Police Department] is not sufficient evidence to withstand the City’s
    motion for summary judgment.” Id. at 317.
    In sum, these legal errors improperly shifted plaintiffs’ evidentiary burden to the City,
    undermining the district court’s judgment. At a minimum, we must vacate the district court’s
    Title VII judgment. The City asks us to go further, though, and find plaintiffs’ step-three
    showing insufficient as a matter of law. We thus must decide whether plaintiffs’ evidence
    presents a triable issue as to the availability of equally valid, less discriminatory testing
    alternatives. It does not.
    E. Plaintiffs’ Insufficient Step-Three Showing
    As noted above, the plaintiffs’ appellate briefing defends the validity and racial impact of
    only the 1996 simulation. The plaintiffs first point to the 1996 process’s validation report and
    the City’s Answer, which concedes that the 1996 process resulted in no adverse impact. The
    plaintiffs next highlight their expert’s testimony regarding the difference between high-fidelity
    simulations and the 2002 process’s low-fidelity video test. Third, the plaintiffs claim that
    statistical evidence shows that the 1996 simulation had higher content validity and lower
    disparate-impact scores than the 2002 process’s tests. Finally, the plaintiffs stress the simplicity
    and affordability of the 1996 process compared to the 2002 process. The scant evidence
    supporting these claims dooms plaintiffs’ reliance on the 1996 simulation as satisfying their
    step-three burden.
    Beginning with the results of the 1996 process as a whole, that evidence does not
    persuade inasmuch as plaintiffs do not seek to substitute the entire 1996 process for the
    2002 process.
    As for the expert testimony, plaintiffs’ expert, Dr. Richard DeShon, asserted that high-
    fidelity exercises have greater validity than video-based tests, explaining that law enforcement
    simulations, like pilot simulators, require the candidate to perform the necessary tasks under
    realistic conditions. (See R. 648-4, Trial Tr. (DeShon) at 533; R. 648-15, Trial Tr. (DeShon) at
    1848.⁴) But plaintiffs’ briefing offers no data showing that simulations provide equally valid and
    less discriminatory evaluations than other forms of practical tests.⁵ Moreover, the virtues cited
    by Dr. DeShon expose another problem with work simulations: scoring subjectivity.
    4
    We note that Dr. DeShon’s initial report in May 2004—more than two years after the administration of the
    2002 process—advocated for both “role plays and video assessments” as less discriminatory testing methods than
    written tests. (R. 656-4, DeShon Rpt. at 14.) After Dr. Jeanneret’s responsive report alerted him to the
    2002 process’s inclusion of a video exam (R. 656-5, Jeanneret Resp. Rpt. at 29), Dr. DeShon issued a supplemental
    report in February 2005 championing high-fidelity simulations, specifically the one used in the 1996 process (R.
    656-6, DeShon Suppl. Rpt. at 23).
    Subjective testing mechanisms open the door to random results and real and perceived
    scoring bias. See, e.g., Allen, 351 F.3d at 315 (“This court previously has noted the potential
    objection to subjective components of evaluation in selection procedures.”); Hearn, 340 F. Supp. 2d
    at 742 (rejecting panel-interviews proposal, explaining that they “could have contributed to a
    feeling among candidates that the process was not fair and unbiased”); Nash v. Consol. City of
    Jacksonville, 895 F. Supp. 1536, 1553 (M.D. Fla. 1995) (rejecting subjective performance
    evaluations, expressing concern that they “would open the process to favoritism, politics and
    tokenism”), aff’d, 85 F.3d 643 (11th Cir. 1996). Tellingly, plaintiffs’ counsel acknowledged this
    problem during the formulation of the 2002 process when he objected to the inclusion
    of subjective testing components. (See R. 657-1, Feb. 26, 2001 Letter to City’s Expert at 4.)
    Equally revealing, plaintiffs’ appellate briefing remains silent on the subjectivity problem.
    We might overlook this pitfall if plaintiffs proffered evidence detailing how a subjective
    component could be scored so as to minimize disparate impact. But, as discussed, they provide
    no explanation for how the City should have meshed the 1996 simulation into the 2002 process,
    whether as a replacement or supplement for the low-fidelity video test, other testing components,
    or the entire process. Without that type of evidence, plaintiffs lose their argument that use of a
    high-fidelity simulation would produce better outcomes, because plaintiffs acknowledge that
    “[e]very single component of the 2002 testing process resulted in ‘very substantial’ adverse
    impact.” (Third Br. at 34; see also First Br. at 23 (detailing the adverse impact of each
    testing component).)
    5
    Indeed, plaintiffs’ appellate briefing takes inconsistent positions regarding whether a low-fidelity video
    exam qualifies as a “practical test,” first arguing that it was the essential practical test for the 2000 process, and then
    arguing that the 2002 process lacked a practical test despite including a video exam. (Compare First Br. at 38–39,
    and Third Br. at 16, with Third Br. at 33.)
    The plaintiffs likewise neglect to account for the City’s legitimate interests in test
    security and efficiency. The 1996 simulation, which individually evaluated more than
    400 candidates’ law-enforcement techniques via two-hour role-play scenarios, required
    numerous actors to produce, lasted three weeks, and took two months to grade. (R. 648-6, Trial
    Tr. (Jones) at 863–66.) Then the City discovered instances of candidate coaching, for which the
    plaintiffs prescribe no remedy, seemingly content with their expert’s unqualified assurance that
    the 1996 simulation would be “easily replicated” at a lesser cost than the 2002 process. (Third
    Br. at 35 (comparing the costs of the two processes: $79,250 for 1996, more than $400,000 for
    2002).) But the costs argument overlooks the cheating problems associated with the 1996 and
    2000 testing; the City hired outside consultants to design the 2002 process to insulate the exam
    from the potential biases of City employees. (See Second Br. at 14–15; R. 648-16, Trial Tr.
    (Claxton) at 2003.) And plaintiffs point to no evidence showing administration of a reliable
    simulation exercise to more than 500 candidates at a reasonable cost (time and money) and in a
    manner that minimizes the likelihood of candidate coaching or information leaking. The City’s
    expert report advised the parties in 2001 that simulations pose such problems, but when the City
    proposed a video test at status conferences before Judge McCalla, the plaintiffs expressed no
    qualms. (See R. 652-4, Jeanneret Rpt. at 38–39; R. 656-17, Status Conf. Hr’g Tr. at 28–32; R.
    60, 7/2/01 Status Conf. Order at 1–2; O.A. at 28:10–29:55, 31:50–32:05.⁶)
    At bottom, plaintiffs rest their proposal on the actual results of the 1996 simulation,
    stressing that it produced less racial disparity than the 2002 process. (See Third Br. at 35
    (comparing the 1996 simulation’s race-disparity score, d=.21, to that of the 2002 process,
    d=.83).) Yet, as the Seventh Circuit explained in Allen—and we agree—past practice alone does
    not suffice. 351 F.3d at 315–17. The “[p]ast success” of a specific testing process “merely
    predicts, but does not establish, success” in future applications. Id. at 315. This broadest of Title
    VII remedies—which requires no showing of discriminatory motive, see Griggs, 401 U.S. at 431—demands
    evidence that plaintiffs’ preferred alternative would have improved upon the
    challenged practice. See Allen, 351 F.3d at 315 (“We cannot require the City to [incorporate
    plaintiffs’ alternative testing proposal based] on mere speculation.”); Zamlen v. City of
    Cleveland, 906 F.2d 209, 220 (6th Cir. 1990) (rejecting test-rescoring proposal, where plaintiffs
    offered only speculation of a less discriminatory impact). This is especially true here, where
    plaintiffs propose a cumbersome exercise with a track record of security problems, no objective
    measures of candidate performance, and no explanation for how it could fit into the 2002 process
    or why it would produce better outcomes. The one-off results of the 1996 simulation, without
    more, do not carry plaintiffs’ burden.
    6
    Though the City’s consultants may not have examined the exact components of the 1996 process, the
    report and the parties’ discussions before the district court belie the plaintiffs’ claim that the City failed to
    investigate the possibility of using simulations.
    Though arguably forfeited by plaintiffs’ minimalist briefing, the Chicago-plan and
    integrity/conscientiousness-testing proposals fare no better. Again, plaintiffs offer no
    justification for their comparative validity or discriminatory effect, as compared to the
    2002 process’s testing features. We further note that the Chicago plan’s use of merit-review
    boards suffers from the same subjectivity and speculation problems identified by the Seventh
    Circuit in Allen. See 351 F.3d at 315–17. As for integrity/conscientiousness testing, EEOC
    guidelines generally disfavor tests that measure abstract character traits by making inferences
    about candidates’ mental processes. See 29 C.F.R. § 1607.14(C)(1) (“A selection procedure
    based upon inferences about mental processes cannot be supported solely or primarily on the
    basis of content validity. Thus, a content strategy is not appropriate for demonstrating the
    validity of selection procedures which purport to measure traits or constructs, such as
    intelligence, aptitude, personality, commonsense, judgment, leadership, and spatial ability.”).
    Plaintiffs acknowledge as much. (Third Br. at 9.) With this in mind, the plaintiffs’ expert’s
    vague support for some sort of integrity/conscientiousness testing cannot demonstrate an equally
    valid, less discriminatory alternative. (See Third Br. at 29; R. 684-13, Trial Tr. (DeShon) at
    1681; R. 648-4, Trial Tr. (DeShon) at 670.)
    Ultimately, the district court aptly described plaintiffs’ proposed alternatives as “broad
    suggestions.” No doubt, the 2002 process resulted in a substantially higher percentage of
    unsuccessful African-American applicants. But plaintiffs must offer more to establish a Title VII
    disparate-impact violation. Because plaintiffs failed to present evidence establishing a genuine
    issue of fact regarding the availability of equally valid, less discriminatory alternative testing
    methods, their step-three showing fails as a matter of law.
    Perhaps anticipating this outcome, plaintiffs offer an alternative defense of the district
    court’s Title VII judgment that assails the City’s step-two showing (credited by the district court)
    that the 2002 process was job-related and consistent with business necessity. See Ricci,
    557 U.S. at 578. Accordingly, we backtrack to the step-two standard.
    IV. PLAINTIFFS’ ALTERNATIVE DEFENSE OF TITLE VII JUDGMENT:
    THE CITY’S STEP-TWO SHOWING
    “Once the plaintiff succeeds in making a prima facie disparate-impact case, the defendant
    may avoid liability by showing that the protocol in question has a manifest relationship to the
    employment.” Davis, 717 F.3d at 494 (citation and internal quotation marks omitted). The City
    may meet its step-two burden by showing through “professionally acceptable methods, [that its
    testing methodology is] predictive of or significantly correlated with important elements of work
    behavior which comprise or are relevant to the job or jobs for which candidates are being
    evaluated.” City of Akron, 824 F.2d at 480 (citation and internal quotation marks omitted).
    Courts often refer to a test’s job-relatedness and business necessity in terms of its “validity”—
    denoting the test’s relationship to relevant job content—and “reliability”—referring to its ability
    to produce consistent results. See, e.g., Guardians Ass’n of N.Y. City Police Dep’t, Inc. v. Civil
    Serv. Comm’n, 630 F.2d 79, 101 (2d Cir. 1980). When the employment position involves public
    safety, we accord greater latitude to the employer’s showing of job-relatedness and business
    necessity. Chrisner, 645 F.2d at 1262–63 (finding sufficient support for an employer’s
    truck-driving experience requirements, noting that “[a]n industry with the primary function of
    managing the safety of large numbers of passengers must be allowed more latitude in structuring
    the requirements which could [a]ffect the performance of a primary business objective”); see
    also Spurlock v. United Airlines, Inc., 475 F.2d 216, 219 (10th Cir. 1972) (“[W]hen the job
    clearly requires a high degree of skill and the economic and human risks involved in hiring an
    unqualified applicant are great, the employer bears a correspondingly lighter burden to show that
    his employment criteria are job-related.”).
    The City used a “content validity” model for the 2002 process that tests a “representative
    sample of the content of the job.” 29 C.F.R. § 1607.14(C); accord Gonzales v. Galvin,
    151 F.3d 526, 529 n.4 (6th Cir. 1998) (citing, as an example of a content exam, a secretary’s typing test).
    We recognize that a police department’s selection of testing criteria “is largely a matter within
    the professional judgment of the test writer based upon the particular attributes of the job in
    question.” Police Officers for Equal Rights v. City of Columbus, 916 F.2d 1092, 1099–1100 (6th
    Cir. 1990) (affirming the district court’s conclusion that job-relatedness “does not require precise
    proportionality” between the exam content and the relative importance of job tasks).
    A. District Court’s Validity Findings
    Here, in deeming the 2002 process’s testing methods valid, the district court detailed Dr.
    Jeanneret’s “comprehensive job analysis,” on behalf of the City, to identify the most important
    knowledge, skills, abilities, and personal characteristics (KSAPs) for the sergeant position.
    Jeanneret & Associates sought to assess all 44 of the important KSAPs identified
    in the job analysis and designed the test questions to meet the content validity
    requirements for the assessment. The investigative forms and other materials
    used in the investigative logic test and oral component were very similar to the
    actual materials used on the job and clearly simulated critical job duties.
    Additionally, all of the items on the job knowledge test were developed using the
    same reference materials used by MPD sergeants on the job. The investigative
    logic test involved realistic scenarios that were designed to simulate situations
    encountered and investigative activities performed by sergeants on the job.
    Likewise, the application of knowledge test was designed to evaluate how a
    candidate would respond to common situations encountered on the job. The
    [video-based] oral component also involved realistic scenarios designed to
    simulate situations in which a sergeant would be expected to use oral
    communication skills in responding to a superior officer, responding to the mother
    of a victim, and responding to a new partner.
    (R. 388, Bench Trial Op. at 17, 19–20.) Other than baldly saying that the tests did not measure
    traits relevant to the sergeant position (see Third Br. at 9)—arguments that appear to circle back
    to the claim that the 2002 process needed a work simulation instead of the video test—plaintiffs
    cite no evidence that contests the job-relatedness or representativeness of the KSAPs measured
    in each test component. We discern no clear error with these validity findings.
    B. District Court’s Findings Regarding Reliability & Rank Ordering
    Plaintiffs devote most of their alternative argument to the district court’s findings
    regarding reliability and rank ordering. On reliability, the court found:
    [The City’s expert and the designer of the 2002 process] Dr. Jeanneret
    testified that he did not include a reliability estimate in the validation report
    because the 2002 process was heterogeneous, i.e., it measured numerous broad
    KSAP dimensions that were correlated with one another, and he felt that there
    was no appropriate estimate of reliability. According to Dr. Jeanneret, the most
    appropriate approach to reliability for such a heterogeneous test was test-retest
    reliability, which was not feasible under the circumstances. A reasonable
    alternative, Dr. Jeanneret asserted, would have been to develop an alternate form,
    requiring two identical tests which, he believed, was not possible in light of the
    particular testing environment. Since neither multiple administrations of the test
    nor parallel administration of identical tests were practicable, Dr. Jeanneret
    believed the only potentially applicable method of assessing reliability was to
    measure internal consistency using “coefficient alpha.” Dr. Jeanneret did not
    initially compute coefficient alpha because he intentionally designed a very
    heterogeneous test and making coefficient alpha, in his opinion, an inappropriate
    index of reliability.
    Both Dr. Jeanneret and [plaintiffs’ expert] Dr. DeShon subsequently
    measured coefficient alpha, using somewhat different methodologies.
    Dr. DeShon reported an overall reliability coefficient of .76 using a method
    known as stratified alpha. Dr. DeShon included seniority in his analysis, which
    Dr. Jeanneret testified was inappropriate because seniority was not part of the
    measurement process. (Jeanneret, Tr. Vol. 11, 1287–88; DeShon, Tr. Vol. 5, 575;
    Tr. Vol. 16, 1898, 1912.) The Court agrees that inclusion of seniority was
    inappropriate in assessing the reliability of the test. Since seniority was an
    administrative add-on component, there is no reason to expect that there would be
    a significant correlation or internal consistency between seniority and test items.
    Dr. Jeanneret eventually performed a reliability analysis using a “linear
    composite,” which resulted in a coefficient of .82. He also computed reliability
    using the formula for stratified alpha, which resulted in a coefficient of .83.
    The Court finds credible Dr. Jeanneret’s testimony as to the limited
    applicability of coefficient alpha in measuring reliability of a heterogeneous test
    which draws material for test items from multiple sources. The Court further
    finds that Dr. Jeanneret’s computations of stratified alpha without inclusion of
    seniority scores to be more appropriate than Dr. DeShon’s computation, which
    included seniority. Finally, the Court finds that Dr. Jeanneret’s conclusion that
    the 2002 process was sufficiently reliable is consistent with professional standards
    and is supported by relevant law. See Hearn v. City of Jackson, 340 F. Supp. 2d 728,
    740–41 (S.D. Miss. 2003) (finding that a reliability coefficient of .79 is a
    common and acceptable value in the context of a heterogeneous test
    environment).
    (R. 388, Bench Trial Op. at 21–22 (transcript citations omitted).)
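For readers unfamiliar with the statistics discussed in these findings, the following Python sketch shows the standard textbook formulas for coefficient alpha and stratified alpha. It is an illustration only: the opinion does not state the exact computations either expert performed, the function names are hypothetical, and the scores are invented toy data rather than anything from the record.

```python
from statistics import variance

def cronbach_alpha(items):
    """Coefficient alpha for one component.

    items: one list of scores per test item, with candidates in the same order.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

def stratified_alpha(components):
    """Stratified alpha for a composite built from several components.

    components: one item list (as accepted by cronbach_alpha) per component.
    """
    component_totals = [[sum(s) for s in zip(*items)] for items in components]
    composite = [sum(s) for s in zip(*component_totals)]
    # Each component contributes its score variance times its unreliability.
    error_variance = sum(
        variance(totals) * (1 - cronbach_alpha(items))
        for totals, items in zip(component_totals, components)
    )
    return 1 - error_variance / variance(composite)

# Hypothetical toy data: five candidates, two components (not from the record).
written = [[2, 4, 3, 5, 1], [3, 5, 3, 4, 2]]
oral = [[1, 0, 1, 1, 0], [1, 1, 1, 0, 0], [0, 0, 1, 1, 0]]

print(cronbach_alpha(written))
print(stratified_alpha([written, oral]))
```

Stratified alpha weights each component's unreliability by that component's share of composite variance, which is why it is often preferred when a test combines dissimilar parts.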
    On the subject of rank ordering, the court found:
    Under both Sixth Circuit precedent and the Guidelines, ranking of
    candidates is appropriate where it can be shown that a higher score correlates with
    higher job performance. See Williams v. Vukovich, 720 F.2d 909, 924 (6th Cir.
    1983); 29 C.F.R. § 1607.14(C)(9) (2006). The requirements for rank ordering can
    be met through a substantial demonstration of job-relatedness, variance in test
    scores, and an adequate degree of test reliability. Guardians Ass’n of New York
    City Police Dep’t, Inc. v. Civil Serv., 630 F.2d 79, 104 (2d Cir. 1980).
    As discussed above, the test content of the 2002 process was substantially
    job-related and there was an acceptable level of test reliability. Many sections of
    the test consisted of items in which there were several right answers, with
    differing point values for various elements, and/or opportunities for additional
    credit, all of which serve to distinguish better performing candidates from lesser
    performing candidates. (Def’s Ex. 22, pp. 43–46.) The written test was closely
    modeled after the like section in the 2000 process, which Dr. DeShon
    acknowledged was able to differentiate between those candidates with more job
    knowledge from those with less knowledge. (DeShon, Tr. Vol. 5, 546–47.)
    Additionally, the raw scores on the 2002 assessment show a substantial variance,
    with the highest raw score of 358.750 and the lowest of 174.750, among 517
    candidates. (Def’s Ex. 17.) See City of Columbus, 916 F.2d at 1102–03
    (upholding rank ordering where score range was 40 points among 71 candidates).
    Based on the foregoing, the Court finds that rank ordering of the results of
    the 2002 process was proper, given that the test had an acceptable level of test
    reliability, was substantially job-related, and had substantial variance among
    the scores.
    (Id. at 22–23.)
    Plaintiffs lodge several objections to the reliability and rank-ordering findings, laced with
    a variety of counter-evidence in the opening of their response brief. (See Third Br. at 3–15, 44–
    62.) We distill three primary arguments: (1) that the district court incorrectly determined that Dr.
    DeShon incorporated seniority into his composite reliability score, and thus clearly erred in
    crediting Dr. Jeanneret’s reliability testimony; (2) that the district court applied the wrong legal
    standard for rank ordering, and the City failed to justify rank ordering by showing that higher test
    scores resulted in better job performance; and (3) that the district court erred by accepting the
    City’s use of seniority in the 2002 process. None demonstrates a reversible legal error or clearly
    erroneous factual finding.
    1. Dr. DeShon’s Non-Use of Seniority & the Court’s Credibility Finding
    First, plaintiffs deny the district court’s factual assertion that Dr. DeShon included
    seniority in his reliability calculations. The City appears to concede the inconclusive nature of
    the evidence cited by the district court (see Fourth Br. at 27–28), but notes that any error in this
    regard is harmless because both experts’ reliability scores (.76 from DeShon, .82–.83 from
    Jeanneret) fall within the range of reliability scores accepted by courts. See, e.g., Hearn, 340 F.
    Supp. 2d at 740 (approving of exam with .79 reliability coefficient). Yet any mistake regarding
    the constituent parts of Dr. DeShon’s composite reliability score (.76) leaves undisturbed the
    court’s remaining credibility determinations pertaining to Dr. Jeanneret’s reliability
    methodology and testimony—namely, its approval of (1) “Dr. Jeanneret’s testimony as to the
    limited applicability of coefficient alpha in measuring reliability of a heterogeneous test which
    draws material for test items from multiple sources,” and (2) his “conclusion that the 2002
    process was sufficiently reliable.” (R. 388, Bench Trial Op. at 21–22.)
    The court’s remaining conclusion—choosing Dr. Jeanneret’s reliability estimates (.82–
    .83) over that of Dr. DeShon (.76)—suffers only from the court’s mistaken belief that Dr.
    DeShon’s figure included seniority. So far as we can tell, plaintiffs accept the court’s related
    finding that these specific reliability calculations should not include seniority. Surprisingly, for
    all their complaints about Dr. Jeanneret’s methods, plaintiffs voice no concern for the higher
    result he achieved (.82 or .83⁷) using their preferred calculation method, stratified alpha.
    Arguably, the district court selected Dr. Jeanneret’s number because it found his testimony more
    credible (consistent with its other credibility findings on this issue), not because it believed that
    Dr. DeShon made a calculation error. And even if the district court chose Dr. DeShon’s
    reliability number (.76), the district court cited authority approving a similar reliability
    coefficient. Hearn, 340 F. Supp. 2d at 740–41 (.79); cf. Nash, 895 F. Supp. at 1548 (stating that
    a reliability coefficient “above 0.70 is considered to be reliable”). Plaintiffs provide no authority
    compelling the conclusion that either a .76 or .82–.83 reliability score for this type of test fails as
    a matter of law.⁸
    7
    Plaintiffs suggest in passing that Dr. Jeanneret did not know of “stratified alpha” and did not calculate it.
    (Third Br. at 52.) But Dr. Jeanneret explained that, though he initially lacked familiarity with the term “stratified
    alpha,” the “mathematics of the coefficient . . . [are] basically the same” as the “linear composite” figure he
    calculated. (R. 648-10, Trial Tr. (Jeanneret) at 1285–86.)
    We note that the cited evidence appears to invert the coefficient and stratified alpha scores (.83 and .82)
    noted by the district court and the City’s brief, but plaintiffs make no objection on this ground, and we have no
    reason to believe that the marginal difference between those two scores matters here.
    8
    Of course, we do not suggest that a reliability score of .70 suffices for all tests as a matter of law.
    Reliability determinations depend on the unique circumstances of the testing protocol. We simply acknowledge that
    this aspect of plaintiffs’ reliability argument asks us to determine credibility—something we cannot do. Harrison v.
    Monumental Life Ins. Co., 333 F.3d 717, 723 (6th Cir. 2003) (“Since we are not free to disregard the district court’s
    credibility assessment, the verdict must stand if [plausible evidence] supports [it.]”).
    Instead, plaintiffs charge that Dr. Jeanneret conceded the inappropriateness of his own
    reliability estimate. To the extent plaintiffs suggest that Dr. Jeanneret rejected his own
    calculations, they misread his testimony. (See R. 648-12, Trial Tr. (Jeanneret) at 1507
    (acknowledging that his original report excluded a reliability coefficient, because it would not be
    an appropriate measure for the test, and stating his belief “that the coefficient alpha or internal
    consistency index of reliability [would not be] the most appropriate or even really an appropriate
    index for the reliability of the [2002 process]”).) As the district court noted, Dr. Jeanneret’s
    testimony explains the difficulty of calculating a reliability coefficient for a heterogeneous test—
    i.e., one consisting of multiple, unrelated components that evaluate multiple tasks and
    characteristics. (See R. 648-10, Trial Tr. (Jeanneret) at 1273–81.) In choosing between the
    parties’ similar reliability estimates, the district court reasonably credited Dr. Jeanneret’s
    testimony that the best reliability measures—retesting candidates or administering duplicate
    tests—were impracticable for a process administered to more than 500 candidates. See, e.g.,
    Anderson v. City of Bessemer City, 470 U.S. 564, 573–74 (1985) (“If the district court’s account
    of the evidence is plausible in light of the record viewed in its entirety, the court of appeals may
    not reverse it even though convinced that had it been sitting as the trier of fact, it would have
    weighed the evidence differently.”).
    2. Rank Ordering
    Next, plaintiffs challenge the district court’s approval of the City’s use of rank ordering
    to distinguish between the candidates’ scores, arguing that the court misapplied three legal
    requirements for this scoring method set by this court in Police Officers for Equal Rights:
    (1) sufficient raw score spread, (2) composite and component reliability, and (3) reasonable job
    analysis. Yet, as the City points out, our decision in Police Officers for Equal Rights included no
    such rule; it merely observed that the employer’s expert used those requirements. See 916 F.2d
    at 1102. Our standard states that “[r]anking is a valid, job-related selection technique only where
    the test scores vary directly with job performance.” Id. (quoting Williams v. Vukovich,
    720 F.2d 909, 924 (6th Cir. 1983)). The EEOC guidelines for content-validity studies support this
    approach:
    If a user can show, by a job analysis or otherwise, that a higher score on a content
    valid selection procedure is likely to result in better job performance, the results
    may be used to rank persons who score above minimum levels.
    29 C.F.R. § 1607.14(C)(9) (emphasis added). The City satisfies this likelihood threshold with “a
    substantial demonstration of job relatedness and representativeness,” score variance, and an
    “adequate degree” of test reliability. See Guardians, 630 F.2d at 104; see also Police Officers
    for Equal Rights, 916 F.2d at 1100 (explaining that, while a test should “measure important
    aspects of the job . . . for which appropriate measurement is feasible,” the job-relatedness
    requirement does not demand that the test “measure all [job] aspects, regardless of significance,
    in their exact proportions”).
    The City’s evidence clears this hurdle.
    a. Job-Relatedness
    First, the district court found that the City’s consultants conducted a “comprehensive job
    analysis” to identify the relevant KSAPs for the sergeant position, and that the test components
    measured relevant job tasks using similar materials to those used on the job and realistic law
    enforcement scenarios. (R. 388, Bench Trial Op. at 17, 19–20.) As noted above, the plaintiffs
    present no specific objection to these job-relatedness findings.
    b. Score Variance
    Second, the district court found “substantial variance” among the promotion scores: of
    the 517 tested candidates, the 2002 process yielded a raw-score point spread of 184 points
    between the highest and lowest candidates (358.75–174.75), out of a possible 384.5 points. (Id.
    at 23.) Our review of the exam results reveals no clear error in this finding. (R. 656-23, 2002
    Process Exam Results at 1–14.) Nor do we detect clear error in the court’s finding of significant
    variance. Cf. Police Officers for Equal Rights, 916 F.2d at 1102–03 (permitting rank ordering
    where “[t]here was a spread of more than forty points among 71 test takers,” the highest score
    was 89.66, and the passing score was 70).
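The spread cited by the district court follows directly from the reported high and low raw scores. A minimal sketch of that arithmetic is below; the three figures come from the opinion, while the variable names and the share-of-maximum calculation are illustrative additions, not part of the record.

```python
# Figures reported in the opinion for the 2002 process.
HIGHEST_RAW = 358.75   # highest candidate raw score
LOWEST_RAW = 174.75    # lowest candidate raw score
MAX_POSSIBLE = 384.5   # maximum possible raw score

# Point spread between the highest and lowest candidates.
spread = HIGHEST_RAW - LOWEST_RAW

# Illustrative addition (an assumption, not a figure from the record):
# the spread expressed as a percentage of the maximum possible score.
spread_pct_of_max = spread / MAX_POSSIBLE * 100

print(spread)                       # 184.0
print(round(spread_pct_of_max, 1))  # 47.9
```

On these figures the spread covers a little under half of the available score range.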
    Though plaintiffs stress that only one point separated approximately 30 of the more than
    500 candidate scores, that circumstance pales in comparison to the sort of score-bunching found
    problematic elsewhere. See Guardians, 630 F.2d at 103 & nn.19–20 (finding insufficient
    reliability for rank ordering where nearly 9,000 applicants, or 2/3 of the passing scores, had
    scores between 94 and 97, out of 110 possible points). Moreover, the focus on promotional
    scores here exaggerates the 2002 process’s bunching effect, because the same candidates’ raw
    scores ranged between 303 and 341, or 79.0 and 88.7 on a 100-point scale. (See R. 656-23, 2002
    Process Exam Results at 3–4.) Varying seniority points (1–10) contributed significantly to this
    purported bunching problem.
    c. Reliability
    Third, the district court found sufficient test reliability, crediting Dr. Jeanneret’s
    composite reliability scores of .82–.83. Again, we find no clear error with the court’s factual
    findings and no error with its legal conclusion.
    Plaintiffs briefly mention that the individual components of the 2002 process received
    poor reliability scores ranging from .32–.79. Indeed, the relatively low component reliability
    scores give pause. See Police Officers for Equal Rights, 916 F.2d at 1102 (allowing rank
    ordering where the exam’s component tests achieved reliability scores ranging from .85–.97).
    Though the district court did not make specific findings regarding component reliability scores,
    plaintiffs point to no authority requiring such findings to sustain a rank-ordering test. Cf. id.
    at 1103 (holding that “the trial court was not clearly erroneous in accepting . . . [expert]
    testimony . . . on the issue of reliability and rank order scoring” that happened to include a
    component reliability estimate) (footnote omitted).
    “The district judge is entitled in questions of this kind which require expert [statistical]
    opinion to rely on that opinion.” Id. So too here, where the district court relied on Dr.
    Jeanneret’s opinion that the heterogeneous nature of the 2002 process’s component tests made
    reliability coefficients less appropriate measures of reliability than other, impracticable methods,
    like test/re-test consistency or dual-test administration. (R. 388, Bench Trial Op. at 21–22.)
    And, as we said, both the plaintiffs’ expert and the City’s expert attained composite reliability
    figures greater than .75 regardless of any reliability problems with the component tests.
    Still, the plaintiffs argue that the City produced no evidence that the test scores vary with
    performance so as to justify rank ordering. See Williams, 720 F.2d at 924. And, they add, high
    standard error measurements (SEM +3.64, +10.09 SED) belie the City’s claim of reliable test
    scores, rendering 428 of the 517 candidate scores statistically indistinguishable. Though the
    district court’s opinion did not specifically address SEM or SED, neither of these claims
    undermines its finding that the City demonstrated sufficient reliability for rank ordering. With
    regard to likely test-score/job-performance correlation, Dr. Jeanneret’s supplemental report cited
    published industry principles asserting that “cognitively based selection techniques developed by
    content-oriented procedures . . . can usually be assumed to have a linear relationship to job
    behavior.” (R. 656-7, Jeanneret Resp. Suppl. Rpt. at 35 (acknowledging that the 2002 process,
    while not a cognitive-ability test, had cognitive components).) We also note as significant the
    district court’s finding—unchallenged on appeal—that the 2002 process’s “written test was
    closely modeled after the like section in the 2000 process, which Dr. DeShon acknowledged was
    able to differentiate between those candidates with more job knowledge from those with less
    knowledge.” (R. 388, Bench Trial Op. at 23 (citing R. 648-4, Trial Tr. (DeShon) at 546–47).)
    On the topic of SEM, plaintiffs offer no authority explaining why an SEM range of
    2.8 (Dr. Jeanneret’s corrected estimate calculated during trial) to 3.7, by itself, renders the
    2002 process inherently unreliable or trumps other measurements of reliability. They do not
    show, for instance, the sort of score-bunching and passage-rates deemed problematic by the
    Second Circuit in Guardians. See 630 F.2d at 103 & n.19 (finding unreliable a rank-ordered
    promotional test with an SEM of 2.4, explaining that the test “was too easy” and resulted in
    “8,928 applicants, two-thirds of all who passed, [with] bunched [scores] between 94 and 97” out
    of a possible 110 points).
    As for SED, Dr. Jeanneret’s supplemental report provides detailed reasons, supported by
    industry publications, for not relying on this measurement. (See R. 656-7, Jeanneret Resp.
    Suppl. Rpt. at 34–35.) Specifically, he opposes using large SED bands to equate broad ranges of
    test scores, explaining that SED bands “are calculated based on the normal probability
    distribution,” meaning that “the further apart two scores are, the more likely those scores are to
    be truly different.” (Id. at 34.) He elaborates, citing an industry publication finding that “even
    when a test is quite reliable, a typical SED band covers so large a part of the test score range that
    the preferred interpretation of banding advocates . . . is false.” Dr. Jeanneret goes on to note that
    “test score bands . . . try[ing] to account for measurement error . . . [are] not required, or even
    endorsed by the professional standards in the field of industrial and organizational psychology
    (i.e., Principles, 2003; Standards, 1999).” (Id.)
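The relationship between the two error measures discussed above can be sketched numerically. The sketch assumes the conventional psychometric formulas, under which the standard error of the difference (SED) equals the SEM multiplied by the square root of two and a 95% band multiplies the SED by 1.96. The opinion does not say how the plaintiffs' 10.09 figure was derived, so its agreement with the 95% band computed here is an inference from the arithmetic, not a finding in the record.

```python
import math

# SEM figure cited by the plaintiffs, as reported in the opinion.
SEM = 3.64

# Assumption: conventional formula, SED = SEM * sqrt(2).
sed = SEM * math.sqrt(2)

# Assumption: a two-sided 95% band uses the normal critical value 1.96.
band_95 = 1.96 * sed

print(round(sed, 2))      # 5.15
print(round(band_95, 2))  # 10.09
```

Under these assumptions, two scores less than about ten points apart would fall within a single 95% band, which illustrates why the width of such bands was contested.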
    Ultimately, the district court heard the parties’ competing evidence regarding reliability,
    SEM, and SED, and the court found that the City justified the use of rank ordering with a
    substantial demonstration of job-relatedness, score variance, and an adequate degree of reliability
    supporting the likelihood that test scores would correlate to job performance. We find no clear
    error with the court’s findings of fact in this regard and no error with its ultimate legal
    conclusion regarding rank ordering.
    3. Seniority Scoring
    Last, plaintiffs denounce the City’s use and weighting of candidates’ seniority—an item
    included in their Memorandum of Understanding (MOU) with the officers’ union—as a
    promotional factor. The Supreme Court has held that a “bona fide seniority system [is not]
    unlawful under Title VII,” even though “a seniority system inevitably tends to perpetuate the
    effects of pre-Act discrimination.” Int’l Bhd. of Teamsters v. United States, 431 U.S. 324,
    352–53 (1977) (construing 42 U.S.C. § 2000e-2(h)). Thus, this court will sustain the seniority
    component of a promotional procedure “so long as an intent to discriminate did not enter into its
    adoption and it has been maintained free from any illegal purpose.” City of Akron, 824 F.2d at
    481.
    Though not quarreling with this standard, plaintiffs challenge the binding effect of the
    MOU on the City. But, contractual enforceability aside, without showing discriminatory intent
    or illegal purpose, plaintiffs have no grounds to impugn the City’s use of seniority. As for
    weighting, the plaintiffs suggest that the City’s scoring errors inflated seniority’s impact from an
    intended 10% to 25%. The cited testimony, however, appears to refer to something other than a
    tabulation error; Dr. DeShon differentiates between a “nominal weight” of 10% and an
    “effective” or “actual weight” of 25%, referring to the degree to which seniority affected
    promotion score variance. (R. 648-14, Trial Tr. (DeShon) at 1753–55.) Review of the test
    results (raw scores, scaled scores, and promotion scores) confirms this, revealing that seniority
    accounted for up to 10 points of the promotion score, out of a possible 110 points. (See
    generally R. 656-23.) Regardless of the nature of the alleged scoring error, in the absence of
    evidence that the City’s weighting of seniority reflects a discriminatory intent or other illegal
    purpose, plaintiffs gain no ground. See City of Akron, 824 F.2d at 481. Because the seniority
    component required no additional validation, the district court properly rejected this aspect of the
    plaintiffs’ challenge.
    V. CONCLUSION
    For these reasons, we affirm in part and reverse in part the district court’s judgment. We
    AFFIRM the district court’s immunity-based dismissal of plaintiffs’ negligence claim related to
    the 2000 process, but we REVERSE the district court’s Title VII judgment invalidating the 2002
    process, thereby MOOTING plaintiffs’ challenge to the district court’s choice of remedies for the
    2002 process. We VACATE the district court’s fees award and REMAND for further
    consideration in light of these developments.
    

Document Info

Docket Number: 13-5452, 13-5454

Citation Numbers: 770 F.3d 464, 2014 FED App. 0271P, 2014 U.S. App. LEXIS 20644, 124 Fair Empl. Prac. Cas. (BNA) 1741, 2014 WL 5419935

Judges: Suhrheinrich, Gibbons, Cook

Filed Date: 10/27/2014

Precedential Status: Precedential

Modified Date: 11/5/2024

Authorities (25)

33-fair-emplpraccas-238-32-empl-prac-dec-p-33890-leonard-williams, 720 F.2d 909 (1983)

Hearn v. City of Jackson, 340 F. Supp. 2d 728 (2003)

Abel v. Dubberly, 210 F.3d 1334 (2000)

police-officers-for-equal-rights-andrea-barrett-ronald-bosley-david, 916 F.2d 1092 (1990)

Warren K. Harrison v. Monumental Life Insurance Company, 333 F.3d 717 (2003)

barbara-zamlen-charleen-cuffari-sharon-pirosko-leana-adkins-jennifer, 906 F.2d 209 (1990)

Beaven v. United States Department of Justice, 622 F.3d 540 (2010)

Hershel CLADY, Et Al., Plaintiffs-Appellants, v. COUNTY OF ..., 770 F.2d 1421 (1985)

Percy Allen, Yvette Clinkscale, Paul Gergoire v. City of ..., 351 F.3d 306 (2003)

Mary R. CHRISNER, Plaintiff-Appellee, v. COMPLETE AUTO ..., 645 F.2d 1251 (1981)

Kyle Ciminillo v. Thomas Streicher Daniel Hills Richard ..., 434 F.3d 461 (2006)

russell-aiken-92-6154-william-ashton-92-6159-v-the-city-of-memphis, 37 F.3d 1155 (1994)

23-fair-emplpraccas-909-23-empl-prac-dec-p-31154-the-guardians, 630 F.2d 79 (1980)

Griggs v. Duke Power Co., 91 S. Ct. 849 (1971)

asarco-incorporated-v-secretary-of-labor-and-federal-mine-safety-and, 206 F.3d 720 (2000)

Lloyd Bryant, Desmond Butler, Doris Byrd v. City of Chicago, 200 F.3d 1092 (2000)

GIGGERS v. Memphis Housing Authority, 2012 Tenn. LEXIS 216 (2012)

International Brotherhood of Teamsters v. United States, 97 S. Ct. 1843 (1977)

Ricci v. DeStefano, 129 S. Ct. 2658 (2009)

Nash v. Consolidated City of Jacksonville, 895 F. Supp. 1536 (1995)
