DocketNumber: No. 04-556C
Citation Numbers: 67 Fed. Cl. 384, 2005 U.S. Claims LEXIS 272, 2005 WL 2211951
Judges: Wolski
Filed Date: 7/29/2005
Status: Precedential
Modified Date: 10/19/2024
OPINION AND ORDER
This post-award bid protest is before the Court on cross-motions for judgment on the administrative record. Plaintiff Beta Analytics International, Inc. (“BAI” or “Beta”), was the incumbent performing the predecessor to the contract at issue, and challenges the award of a contract to intervenor Maden Tech Consulting, Inc. Many of the arguments made by BAI concern what may accurately be termed the “minutiae” of the process— such matters as whether it may appropriately be determined that proposals exceeded the solicitation’s requirements, a comparison of the judgment of different evaluators, and second-guessing a number of technical scores. But the protest ultimately presents the questions of whether the process followed by the Government, and the resulting evaluations, contained so many internal inconsistencies as to be arbitrary and capricious, and whether this resulted in prejudice to BAI. For the reasons that follow, the Court concludes that the answer to both questions is “yes.”
I. BACKGROUND
Beta, Maden Tech, and [XXXX] submitted proposals in response to Solicitation N0017403-R-0044 (“the Solicitation”), issued by the United States Department of the Navy on October 15, 2003. The Solicitation sought a contractor to provide intelligence support for the United States Department of Defense’s Defense Advanced Research Projects Agency (“DARPA”):
The Indian Head Division, Naval Sea Systems Command, Indian Head, Maryland (IHD/NAVSEA) has been tasked to provide Scientific, Engineering, and Technical Assistance (SETA) support services as required to support the Security and Intelligence Directorate (SID) of the Defense Advanced Research Projects Agency (DARPA) and other similar Government agencies such as DISA and ONR. In fulfilling this responsibility, SID/DARPA desires support services in developing, implementing and maintaining programs that facilitate the secure and successful accomplishments of the SID/DARPA mission while protecting DARPA personnel, information and property, which are consistent with DARPA Mission, Public Law, National Policy, applicable Executive Orders, and Department of Defense Directives and Regulations.
Admin. R. at 106 (Statement of Work (“SOW”) § 1.0).
A. The Source Selection
The Solicitation requested proposals for a contract to be awarded through a negotiated procurement. Admin. R. at 101. Offerors were instructed to submit a “Technical Proposal” covering the factors “Program Plan,” “Experience,” and “Personnel,” which were “listed in descending order of importance.” Admin. R. at 175-77. Detailed instructions were provided concerning the content of each of these factors and their sub-factors. Id.
Offerors were informed that proposals would be evaluated based on the following factors, also “in descending order of importance”: Technical Proposal; Past Performance; and Cost/Price. The Solicitation provided that:
In determining best overall value, the Government will first assess an Offeror on the basis of Technical proposal then compare and rank Offerors on the basis of past performance. Then the Government will compare the tradeoffs between relative margins of technical ranking, performance and price. The offer [sic] who represents the best value will be the Offeror who represents the best tradeoff between technical excellence, superior performance and price.
Admin. R. at 186. Under the heading of “Methodology,” offerors were told that their submissions “shall be reviewed by the technical review team” concerning the three Technical Proposal Factors (Program Plan, Experience, and Personnel). Admin. R. at 188. The Solicitation further explained that “[e]ach factor shall be reviewed and assigned a score,” and that “[o]nce all evaluations are complete the corresponding scores shall be tabulated.” Id. After providing an explanatory table showing the factor scores, performance rating, and cost/price for fictitious offerors, the Solicitation explained:
Once this information is tabulated, Offerors will be compared making value and price tradeoffs and award will be made to the Offeror that represents the Best Value to the Government. If the Offeror with the highest scores also represents the lowest price then that Offeror is likely to be the Best Value.
Id. (emphasis added).
An undated “Source Selection Plan,” not divulged to prospective offerors, provided more detail on the evaluation process. See Admin. R. at 79-98. This plan provided for an evaluation panel made up of chairman Joe McClure, co-chairman Carla Little-Kopach, and members Neva Gartrell, Dorothy Aronson, and Patrick Bailey. Id. at 79. The Source Selection Plan specified the scoring of the Technical Proposal factors, with the Program Plan factor worth a maximum forty-five points, Experience thirty points, and Personnel twenty-five points. Id. at 79, 83. A separate factor, Past Performance, was to be judged using a descriptor ranging from Poor to Excellent. Id. at 81-82.
Score sheets that were attached to the Source Selection Plan revealed the specific scoring for the sub-factors of the Technical Proposal factors. The Program Plan was allotted a maximum of twenty points for the over-all proposal, plus each of the five sub-factors (Risks Associated with Contract Performance; Measurement of Provided Services; Contract Transition; Staffing Plan; and Corporate Support) was worth a maximum of five points. Admin. R. at 85, 88-89. The Experience factor was broken down into six sub-factors, each worth a maximum five points: Counterintelligence; International Security; Communications Security; Information Security; Industrial Security; and Program Security Support. Id. at 91-93. Concerning the mechanics of scoring, the Source Selection Plan provided:
Each individual evaluator shall score each offer using the attached Individual Evaluation Sheets. Once all Offerors are evaluated, the Chairperson shall use the Chairperson’s Summary Sheet to enter the scores of the individual evaluators. These scores shall be averaged and that average shall be the score that Offeror receives. Past Performance shall be rated by the Contracts personnel based on information received from the references submitted by each Offeror or based on the Government’s own knowledge based on experience with prior or current contracts with an Offeror. [¶] Once all scores are tabulated, the information shall be entered on the Decision Form and a comparative analysis performed to determine the Offeror that represents the best overall value to the Government. [¶] The Contracting Officer shall retain ultimate source selection authority.
Admin. R. at 83-84 (emphasis added).
The above-mentioned “Summary Sheet” has grids for each offeror, in which each panel member’s score for the Program Plan, Experience, and Personnel factors, as well as the total score, is contained. See id. at 96. The three individual evaluators were panel members Gartrell, Aronson, and Bailey. Their individual score sheets, and associated notes, are a part of the record. See Admin. R. at 1012-61 (Maden Tech scoring); id. at 1064-1111 ([XXXX] scoring); id. at 1112-1160 (BAI scoring). According to these individual score sheets, listed in order of importance (Program Plan, Experience, Personnel), Aronson gave BAI scores of 43, 30, and 24, for a 97 in total, see id. at 1148, 1154, 1158, and gave Maden Tech scores of 44, 26, and 23, for a total of 93. See id. at 1012, 1018, 1022. Gartrell gave BAI scores of 33, 22, and 15, for a total of 70, see id. at 1135, 1141, 1145, and gave Maden Tech scores of 43, 25, and 25, for a total score of 93. See id. at 1025, 1031, 1035.
Bailey, the third member of the team, gave BAI scores of 28, 27, and 23, for a total of 78, see id. at 1112, 1118, 1122, and gave Maden Tech scores of 40, 16, and 22, for a total also of 78. See id. at 1038, 1044, 1048. The low Program Plan score he gave BAI, however, was explained by the following note: “I marked risks associated with contract performance and contract transition as N/A [not applicable] which lowers the total possible base score to 35 of this section.” Id. at 1112. When the individual evaluators’ scores were entered on the Chairperson’s Summary Sheet, the BAI Program Plan score for Bailey is written down not as a 28 but as a 34, although it appears that the second digit was changed from a six to a four. See id. at 1004. According to this document, signed by the three evaluators and chairman McClure on December 11, 2003, the average scores of BAI were 36.66 for Program Plan, 26.33 for Experience, and 20.66 for Personnel, for a total of 83.66 and a rounded, final average score of 84. Id. Maden Tech received average scores of 42.33, 22.33, and 23.33, for a final averaged score of 88. Id.
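For illustration, these figures can be reconstructed from the individual scores recited above (using the 34 recorded for Bailey; the Summary Sheet appears to have truncated, rather than rounded, the repeating decimals):

$$\text{BAI: } \frac{43+33+34}{3} = 36.66\ldots,\quad \frac{30+22+27}{3} = 26.33\ldots,\quad \frac{24+15+23}{3} = 20.66\ldots,\quad \text{total } 83.66\ldots \approx 84$$

$$\text{Maden Tech: } \frac{44+43+40}{3} = 42.33\ldots,\quad \frac{26+25+16}{3} = 22.33\ldots,\quad \frac{23+25+22}{3} = 23.33\ldots,\quad \text{total } = 88$$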
A memorandum dated January 26, 2004, and signed by chairman McClure and Contracting Officer Penny S. Kennedy, contained the determination that Maden Tech offered the best value to the Government and recommended that it be awarded the contract. See Admin. R. at 989, 1003. The first page of this memorandum contains a table, in which the Technical Proposal score, the Past Performance rating, and the Cost/Price for each of the three offerors are listed. The Technical Proposal scores are identical to the rounded, final average scores from the Chairperson’s Summary Sheet. Compare Admin. R. 989 with id. at 1004. These same scores are repeated in the table on the Decision Form signed by the Contracting Officer and dated January 22, 2004. Id. at 1005.
This use of the average of the individual evaluators’ scores is consistent with the Source Selection Plan, which stated that “[t]hese scores shall be averaged and that average shall be the score that Offeror receives.” Admin. R. at 83.
For the Program Plan over-all proposal (the part worth 20 points), the only comments in the discussion of Maden Tech’s and BAI’s proposals, other than the summary/excerpts, appear to be the characterization of the proposals and some of their aspects as “comprehensive.” Compare Admin. R. at 989-90, 993-94 with id. at 326-35, 876-88. Both BAI and Maden Tech received 19 of the 20 available points for this portion, directly corresponding with the average of the individual evaluators’ scores. See Admin. R. at 989, 993, 1012, 1025, 1038, 1112, 1135, 1148.
The memorandum also showed Maden Tech receiving, as the score for each sub-factor of the Program Plan factor, the average of the individual evaluators’ scores for each sub-factor. All five sub-factor narratives include a brief summary description and excerpts, but only three of the five include any sort of analysis or comment. The “Risks Associated with Contract Performance” paragraph called a [XXXX] “comprehensive” and ended: “This along with their narrative provided convincing evidence that the risk would be very low if they were awarded the contract.” Admin. R. at 990. The “Measurement of Provided Services” paragraph concluded, “[t]heir response was very comprehensive and complete and provides a high level of confidence in their ability to measure and improve on their process and services to achieve the requirements and goals of the SOW.” Id. And the “Contract Transition” paragraph called a [XXXX] “comprehensive” and stated that Maden Tech “has been very successful” in performing transition duties, although this is presumably based on the company’s own account of its experience and accomplishments. See id. at 990-91.
Unlike the section on Maden Tech, the memorandum’s narrative on BAI’s Program Plan sub-factors does not correspond to the average scores given by the individual evaluators. This is partly due to Bailey’s decision that “not applicable” sub-factors should not be scored. On Bailey’s score sheets, he wrote “N/A” rather than any number for the score under “Risks Associated with Contract Performance,” Admin. R. at 1115, and “Contract Transition.” Id. at 1116. Thus, only two evaluators scored these categories. For the former, Aronson gave BAI a five and Gartrell gave it a two. See id. at 1148, 1135.
The comment portion of each paragraph on the BAI sub-factors was more extensive than for Maden Tech. For instance, after one sentence that summarizes BAI’s first paragraph in response to the “Risks Associated with Contract Performance,” the memorandum reads:
The rest of the explanation defined the risk of another offeror taking over and risk involved with it. The proposal did not address any specific risks, though some may exist even with the incumbent (complacency for example), the program plan could have addressed some potential risks for contractor performance and how they planned on mitigating these risk [sic].
Admin. R. at 994.
This method and process does not address [XXXX].
Id. at 995. After repeating nearly the entire BAI response for the “Staffing Plan” sub-factor, compare id. with id. at 890, the memorandum adds: “The plan was actually a statement and lacked explanations for recruitment and replacement of current employees and capabilities to respond to surges in work requirements, etc.” Admin. R. at 995.
Most curious, however, is the treatment of BAI’s “Corporate Support” proposal. Beta’s first paragraph is nearly reproduced verbatim. Compare Admin. R. at 995 with id. at 890. Beta’s second paragraph, which begins with the words “More importantly” and identifies a particular individual as an example of “a value added asset” available for the contract, is ignored. So, too, is BAI’s third paragraph, which describes how its [XXXX] has assisted in [XXXX]. See id. at 890; see also id. at 880 (describing how the [XXXX] functions). But the [XXXX] was not only mentioned prominently as a “unique management tool” in the memorandum’s Program Plan narrative, see Admin. R. at 994, it was specifically noted in the comments of every individual evaluator on this topic. See id. at 1117, 1140, 1153. The memorandum ignores these specifics and concludes its paragraph on the sub-factor: “The support as described in the Program Plan is adequate, but normal support we would expect from any contractor.” Id. at 995.
The narratives for the next factor in importance, “Experience,” contain even less commentary on the evaluated sub-factors. Following each brief summary or excerpts from an offeror’s proposal is just one sentence of evaluation. When four out of a possible five points were given for a sub-factor, this sentence would read that the exhibited experience “meets all of the requirements of the statement of work.” See Admin. R. at 992-93, 996-97. When the maximum five points were awarded, after stating that all requirements of the SOW were met, the sentence also contained the conclusion that the experience “in some cases exceeds the requirements.” See id. at 993, 996-97. For two sub-factors, Maden Tech received three points. The evaluation sentence for the first read, “Maden Tech’s experience in [XXXX] is limited, but does demonstrate adequate knowledge of the requirements.” Id. at 992. Concerning the second, the memorandum read: “Maden Tech had only limited experiences with some requirements ... but it did demonstrate the
For the “Personnel” factor, the scores assigned to the offerors in the memorandum were identical to the average of the individual evaluators’ scores. See Admin. R. at 993, 997, 1002, 1004. Maden Tech received 23 out of 25 points, justified as follows:
Maden Tech’s personnel matrix and writeup on their key personnel demonstrated all of the proposed personnel meet the qualifications of their respective labor category referenced in Section C of the RFP by submitting current information about their work experience, education and whether the person is presently employed with the Offeror, currently employed by a proposed subcontractor, or proposed under letters of intent. All letters of intent were provided with personnel matrix. Maden Tech had at least one key person for every position with the exception of the Senior Security Analyst and International Security Specialist; however, the Maden Tech employed a majority of the key personnel.
Id. at 993. Beta received 21 points, id. at 997, as the Navy determined:
The proposed key personnel satisfactorily meet the minimum qualifications for education and experience and demonstrate knowledge and capability to perform the requirements in the Statement of Work. A majority of the key personnel are currently employed with the prime contractor with one key person providing a letter of intent. The non-key personnel proposed satisfactorily meet the personnel qualifications listed in the solicitation.
Id. at 998. As was the case with the other factors, these “Personnel” narratives have little in common with the comments the individual evaluators made on their score sheets. See, e.g., Admin. R. at 1022-23, 1035-37, 1048-55, 1122-24, 1130-34, 1145-47, 1158-60.
The value and price tradeoffs, as was noted above, were based on the total score for each offeror’s Technical Proposal, and made no mention of any differences among the specific features of these proposals. Following the Solicitation, see Admin. R. at 175, 186, the memorandum stated that “[t]he technical evaluation is the most important factor and price is the least important factor.” Id. at 1003. Concerning the determination at issue in this case, the memorandum provided:
Maden Tech with a technical evaluation score of 88, an excellent past performance and a price of $[XXXX] was determined to be a better value than BAI with a technical evaluation score of 84, an excellent past performance and a higher price of $[XXXX].
Id. Comparing Maden with [XXXX], it was noted that [XXXX]’s technical score was “in the marginal range compared to Maden
The contract was awarded to Maden Tech. A “Business Clearance Memorandum” (“BCM”) dated March 4, 2004, submitted by Contracting Officer Kennedy and Contract Specialist Michael L. Burch, and signed by three members and the chairman of a Navy “Contract Review Board,” is included in the record. See Admin. R. at 1212.
The BCM also contains the Contracting Officer’s Best Value determination, using language that is virtually identical to the tradeoff discussion in the January 26, 2004 memorandum. Compare Admin. R. at 1234 with id. at 1003. One difference is the insertion of the sentence, “[b]ased on the comparison of offers Maden Tech Consulting, Inc. provides the best opportunity for success and is the ‘best value’ for this award.” Id. at 1234. Another is the addition of the introductory phrase “[b]ased on the evaluation of the contractor’s technical factors” to the trade-off sentence discussing Maden Tech and [XXXX]. Id.
During the hearing on the motions for judgment on the administrative record, another portion of the record was highlighted. Without any explanation, the record contains a second set of individual evaluations of Maden Tech’s proposal, by Aronson, Gartrell, and a third person named Riva Meade, who was not a member of the evaluation panel. Admin. R. at 1161-91. On this other set of score sheets, Aronson gave Maden Tech a Program Plan score of 41, which is three points lower than her score of 44 that was used to award the contract. Compare id. at 1012 with id. at 1161-65. Her Personnel score on this other set of score sheets was a 25, two points higher than the score used in the award. Compare id. at 1022 with id. at 1169. Gartrell gave Maden Tech a Program Plan score of 37, which was six points lower than the 43 used in the contract award. Compare id. at 1025 with id. at 1171-75. She gave the same Personnel score of 25 each time. Id. at 1035, 1180.
Perhaps even more unusual than the inclusion of score sheets used by Meade is the content of the score sheets used to evaluate Experience in this second set. The six sub-factors listed do not correspond with the Solicitation or the Source Selection Plan. Instead of covering the sub-factors of counterintelligence support, international security, communications security, information security, industrial security, and program security support, the sheets instead ask the
B. Procedural History
The Navy determined that Maden Tech’s proposal offered the best value to the Government, and awarded it the contract in mid-March, 2004. See Admin. R. at 1211-12, 1231, 1539. On April 2, 2004, BAI filed a complaint and a motion for a preliminary injunction, after learning that its technical evaluation score was within four points of Maden Tech’s score, while its price was $452,626.97 higher than Maden Tech’s. Beta argued that its bid was scored incorrectly, and that if scored correctly it would have received a higher score than Maden Tech and thus could have been awarded the contract. It also argued that bias against it played a part in the decision. Without the benefit of an administrative record, which was not filed until the following week, the Court held a hearing on BAI’s motion for a preliminary injunction on April 7, 2004, and denied the motion at the hearing’s conclusion. Shortly thereafter, BAI’s contract came to an end and Maden Tech began performing under the contract at issue.
Beta subsequently requested leave to conduct discovery to supplement the administrative record. That request was denied June 30, 2004. Beta, 61 Fed.Cl. at 228. After the government and Maden Tech separately filed their motions for judgment on the record pursuant to Rule 56.1 of the Rules of the United States Court of Federal Claims (“RCFC”), BAI moved for leave to amend its complaint. The only change to the original complaint was the replacement of the sixth count, alleging improper bias against BAI in the procurement process, with a specific allegation that the Navy failed to make a conflict of interest determination in violation of 48 C.F.R. § 9.504(e) (2004). Compare Compl. ¶ 86 with Am. Compl. ¶¶ 86-88. As this claim had been raised in BAI’s motion for leave to conduct discovery, the Court determined that the amendment would not be prejudicial to the Government or to Maden Tech. Order (July 30, 2004). Finding BAI’s delay in moving to file the amended complaint (three months since receiving the administrative record, and three weeks since its opponents filed their motions for judgment) not to be excessive, the Court granted it leave to file the amended complaint. Id. Beta filed a cross-motion for judgment on the record, and the Court held a hearing on the motions after the briefing had been completed.
II. DISCUSSION
Beta argues that the technical evaluation process that resulted in its score of 84 and a score of 88 for Maden Tech was arbitrary, capricious, and unlawful. In particular, BAI contends that the Navy evaluators misapplied or ignored the evaluation criteria; improperly used criteria or factors that were not disclosed to the offerors; did not adequately document their evaluation decisions; and without justification departed from the Source Selection Plan by evaluating Maden Tech’s proposal twice. Beta also claims that the Navy violated 48 C.F.R. § 9.504(e) by failing to make a conflict of interest determination.
A. Standard of Review
Post-award bid protests are heard by this Court under the Tucker Act, as amended by the Administrative Dispute Resolution Act of 1996 (“ADRA”), Pub.L. No. 104-320, §§ 12(a)-(b), 110 Stat. 3870, 3874 (1996). 28 U.S.C. § 1491(b)(1) (2000). This provision requires our Court to follow the Administrative Procedure Act (“APA”) standards of review: “In any action under this subsection, the courts shall review the agency’s decision pursuant to the standards set forth in section 706 of title 5.” 28 U.S.C. § 1491(b)(4). The APA standards, incorporated by reference, provide that a:
reviewing court shall ... (2) hold unlawful and set aside agency action, findings, and conclusions found to be — [¶] (A) arbitrary, capricious, an abuse of discretion, or otherwise not in accordance with law; [¶] (B) contrary to constitutional right, power, privilege, or immunity; [¶] (C) in excess of statutory jurisdiction, authority, or limitation, or short of statutory right; [¶] (D) without observance of procedure required by law; [¶] (E) unsupported by substantial evidence in a case subject to sections 556 and 557 of this title or otherwise reviewed on the record of an agency hearing provided by statute; or [¶] (F) unwarranted by the facts to the extent that the facts are subject to trial de novo by the reviewing court. In making the foregoing determinations, the court shall review the whole record or those parts of it cited by a party, and due account shall be taken of the rule of prejudicial error.
5 U.S.C. § 706.
Based on an apparent misreading of the legislative history, see Gulf Group, Inc. v. United States, 61 Fed.Cl. 338, 350 n. 25 (2004), the Supreme Court had determined, before the 1996 enactment of the ADRA, that the de novo review standard contained in 5 U.S.C. § 706(2)(F) does not usually apply in review of informal agency decisions — decisions, that is, such as procurement awards. See Citizens to Preserve Overton Park, Inc. v. Volpe, 401 U.S. 402, 415, 91 S.Ct. 814, 28 L.Ed.2d 136 (1971). Instead, courts are to employ the standard of 5 U.S.C. § 706(2)(A): whether the agency’s acts were “arbitrary, capricious, an abuse of discretion, or otherwise not in accordance with law.” See Overton Park, 401 U.S. at 416, 91 S.Ct. 814 (citation omitted); see also Camp v. Pitts, 411 U.S. 138, 142, 93 S.Ct. 1241, 36 L.Ed.2d 106 (1973). The “focal point for judicial review should be the administrative record already in existence,” id., and this applies even where, as here, the matter being reviewed was not the product of a formal hearing. See Fla. Power & Light Co. v. Lorion, 470 U.S. 729, 744, 105 S.Ct. 1598, 84 L.Ed.2d 643 (1985).
A motion under RCFC 56.1 for judgment on the administrative record differs from a motion for summary judgment under RCFC 56, as the existence of genuine issues of material fact does not preclude judgment under the former. Compare RCFC 56.1(a) (incorporating only the standards of RCFC 56(a)-(b)) with RCFC 56(c) (containing the disputed facts standard); see Bannum, Inc. v. United States, 404 F.3d 1346, 1355-56 (Fed.Cir.2005). A motion for judgment on the administrative record examines whether the administrative body, given all the disputed and undisputed facts appearing in the record, acted arbitrarily, capriciously, or contrary to law. See Arch Chems., Inc. v. United States, 64 Fed.Cl. 380, 388 (2005); Gulf Group, Inc., 61 Fed.Cl. at 350; Tech Sys., Inc. v. United States, 50 Fed.Cl. 216, 222 (2001). If arbitrary action is found as a matter of law, the Court will then decide the factual question of whether the action was prejudicial to the bid protester. See Bannum, 404 F.3d at 1351-54.
Under the “arbitrary and capricious” or “abuse of discretion” standard of review, the court must consider whether the decision was based on a consideration of the relevant factors and whether there has been a clear error of judgment. Although this inquiry into the facts is to be searching and careful, the ultimate standard of review is a narrow one. The court is not empowered to substitute its judgment for that of the agency.
Because of the deference courts give to discretionary procurement decisions, “the disappointed bidder bears a ‘heavy burden’ of showing that the award decision ‘had no rational basis.’” Impresa Construzioni, 238 F.3d at 1333 (quoting Saratoga Dev. Corp. v. United States, 21 F.3d 445, 456 (D.C.Cir.1994)); see also Cont’l Bus. Enters., Inc. v. United States, 196 Ct.Cl. 627, 452 F.2d 1016, 1021 (1971) (noting the “unusually heavy burden of proof” when a technical determination is challenged). The evaluation of proposals for their technical excellence or quality is a process that often requires the special expertise of procurement officials, and thus reviewing courts give the greatest deference possible to these determinations. See E.W. Bliss Co. v. United States, 77 F.3d 445, 449 (Fed.Cir.1996); Arch Chems., 64 Fed.Cl. at 400; Gulf Group, 61 Fed.Cl. at 351; Overstreet Elec. Co. v. United States, 59 Fed.Cl. 99, 102, 108, 117 (2003). As the Federal Circuit has explained, challenges concerning “the minutiae of the procurement process in such matters as technical ratings ... involve discretionary determinations of procurement officials that a court will not second guess.” E.W. Bliss Co., 77 F.3d at 449; see also Banknote Corp. of Am. v. United States, 56 Fed.Cl. 377, 384 (2003) (determining that “naked claims” of disagreement with evaluations “no matter how vigorous, fall far short of meeting the heavy burden of demonstrating that the findings in question were the product of an irrational process and hence were arbitrary and capricious”).
The presence (by the government or intervenor) or absence (by the protester) of any rational basis for the agency decision must be demonstrated by a preponderance of the evidence. See Gulf Group, 61 Fed.Cl. at 351; Overstreet, 59 Fed.Cl. at 117; Info. Tech. & Appl’ns Corp. v. U.S., 51 Fed.Cl. 340, 346 (2001) (citing GraphicData LLC v. United States, 37 Fed.Cl. 771, 779 (1997)), aff'd, 316 F.3d 1312 (Fed.Cir.2003).
An additional ground for a contract award to be set aside is when the protester can show that “the procurement procedure involved a violation of regulation or procedure.” Impresa Construzioni, 238 F.3d at 1332.
B. Was BAI Potentially Prejudiced by the Challenged Actions?
Before reaching the merits, a threshold question that must be addressed is whether the bid protester was potentially prejudiced by the decisions it challenges. Info. Tech., 316 F.3d at 1319. This question comes first because the standing of disappointed bidders turns on it. Id.
Beta’s allegations satisfy this initial determination of potential prejudice. If the various challenges to the Technical Proposal scoring were well-founded, these would be more than adequate to bridge the four-point gap between BAI’s score of 84 and Maden Tech’s score of 88. Beta thus could have had a substantial chance at the award, as cost was the least important evaluation factor. See Admin. R. at 186, 188.
C. Was the Navy’s Evaluation of Offers Arbitrary and Capricious?
Before turning to the specific claims of arbitrary or unlawful action in the evaluation of offers for this contract, the Court must first determine precisely what it may review concerning a procurement decision, and how it may review it. It is well-settled, of course, that the Court is not to second-guess the discretionary judgments of evaluators. But to decide if the Navy made “a clear error of judgment,” Overton Park, 401 U.S. at 416, 91 S.Ct. 814, the Court must first determine whose judgment counts. To decide whether the Navy “considered the relevant factors and articulated a rational connection between the facts found and the choice made,” Balt. Gas & Elec. Co. v. Natural Res. Def. Council, 462 U.S. 87, 105, 103 S.Ct. 2246, 76 L.Ed.2d 437 (1983) (citation omitted), requires one to identify the facts and the choice in question.
This determination is not so obvious under the circumstances presented in this case. The ultimate choice was the decision that Maden Tech’s proposal offered the Government the “best value.” Under the Source Selection Plan, the Contracting Officer “retain[ed] ultimate source selection authority.” Admin. R. at 84; see also 48 C.F.R. § 15.308. In the BCM, she wrote that she “concur[red] with the award to Maden Technologies, Inc. as detailed in this business clearance.” Admin. R. at 1233. She made this award based on a comparison of the Technical Proposal scores, Past Performance ratings, and proposed price of each offeror. See id. at 1233-34. By that time, however, the Technical Proposal evaluations were boiled down to just one number for each offeror — an 88 for Maden Tech, an 84 for BAI, and a [XX] for [XXXX]. Id.
Clearly, Court review is not limited to this choice and these facts — that 88 is larger than 84 and $[XXXX] is smaller than $[XXXX]. All parties agree that what is important is that there be a reasoned explanation for the 88 and the 84. At the hearing, Maden Tech took the position that the BCM is the “decision that we should focus on,” Tr. (Sept. 1, 2004) at 225, and that it reflects the Contracting Officer’s analysis and scores. Id. at 225-26. The Government took the position that the January 26, 2004 memorandum to the file provided the requisite narrative explaining the evaluation. See, e.g., id. at 134-35, 163.
But there is nothing in the administrative record to support the allegation that there was any meeting of the evaluation panel. The January 26, 2004 memorandum identifies
Without a law, a regulation, or the Source Selection Plan requiring a consensus meeting as a predicate to these narratives, the presumption of regularity does not support the occurrence of such a meeting. See Tecom, Inc. v. United States, 66 Fed.Cl. 736, 769-70 (2005). And nothing in the narratives themselves lends credence to this allegation — as they consist primarily of brief summaries or excerpts from each proposal, and their sparse commentary rarely, if ever, resembles the individual evaluators’ comments from the score sheets.
On the other hand, BAI argues that the Court should scrutinize each individual evaluator’s score sheets and notes, looking for a reasoned explanation for every score of every factor and sub-factor. See, e.g., Tr. at 49, 54-55, 69-70, 72-73, 84-86; see also Am. Mem. Supporting Pl.’s Mot. J. Admin. R. (“Pl.’s Mot. Mem.”) at 20-21, 24-25. Beta would have the Court determine arbitrariness in scoring by comparing the notes and scores of different evaluators, see, e.g., Tr. at 58, 65-68, 71-72, and by comparing individual evaluators’ score sheets to the January 26, 2004 memorandum. Id. at 61-63. Thus, if the Source Selection Plan provides that, to get the maximum score in a category, an offeror must show A and B and C, an evaluator must explain why he or she found A and B and C; and this decision can be, in essence, impeached by another evaluator’s decision that one of these elements is missing. What this approach has to recommend it is the fact that the Technical Proposal scores, which were the basis for the best value determination, were supposed to be derived directly by averaging the scores of these individuals. See Admin. R. at 83 (“These scores shall be averaged and that average shall be the score that Offeror receives.”) (emphasis added). The Technical Proposal scores — which were the most important factors in the award decision, see Admin. R. at 79, 186 — are entirely dependent on the judgments of each and every individual evaluator.
But such close scrutiny risks entangling the Court in “the minutiae of the procurement process,” E.W. Bliss Co., 77 F.3d at 449, and displacing deference with second-guessing. This dilemma may well be inherent in the competitive procurement process. Competition among potential suppliers of goods and services is expected to lead to higher quality and lower prices. See Arch Chems., 64 Fed.Cl. at 400 & n. 41. Price competition is plain to see, and objectively easy to measure. But there are many goods and services that are not fungible, and procuring suitable quality is often a concern. Just as private consumers do not typically choose the cheapest automobile or physician, but also evaluate their options based on quality, we often expect the same calculus to be made when certain public goods and services are obtained by our governments. Hence, “best value” determinations are made.
Evaluating quality and making price/quality tradeoffs, however, involves judgment, which introduces discretion, which in turn raises the specter of arbitrary or biased decision-making. Curtailing the latter is one of the goals of using competition in the first place, so that taxpayers get their money’s worth and potential bidders are not deterred by fear of unfair processes. Thus, elements of transparency are built into the process, by statute and by regulation. Potential bidders
This transparency, of course, readily lends itself to APA-type review of the source selection decision, in theory facilitating a Court’s verification that the relevant factors were considered, and that a reasoned connection exists “between the facts found and the choice made.” Balt. Gas & Elec. Co., 462 U.S. at 105, 103 S.Ct. 2246 (citation omitted). But it also aims at making more articulate the sort of judgments that are often tacitly made in the private sector. Although the most complicated, multi-factor equations may be churning in the mind of a typical consumer, rare is the person who sits down and attempts to assign qualitative scores to the various features of, say, the automobiles they test-drove, let alone attempt to write a narrative on each. In an effort to make the procurement process as objective as possible, the Navy utilized evaluators who were to give hard numbers to the qualities offered by the three offerors, hard numbers which were then averaged and used as data in the decision. Paradoxically, this effort may have made the process seem less, not more, objective.
In the first place, an average is not the product of reasoning, but is instead the result of a mathematical process. If on a scale of zero to five, one evaluator scores a feature a “one,” another scores it a “two,” and the third gives it the maximum five points, the rounded average comes to a three. Two of the three evaluators thought it poorer than average, and if consensus were necessary they might have carried the day. But in any event the “three” was not derived from the belief that the feature was average, and indeed not one of the evaluators in this example thought so. The Navy bound itself to the average of the evaluators’ scores for the three Technical Proposal factors, although each average is a number with no reason behind it. This shortcoming was to be rectified by the narratives employed in the January 26, 2004 memorandum and the BCM. As is noted above, these narratives do not appear to have been written by the actual evaluators whose scores determined the averages used. But even if they were, this sort of “score first, give reasons later” approach would be merely supplying a rationalization for the non-rational.
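The arithmetic of the example above is simply

$$\frac{1+2+5}{3} = 2.66\ldots \approx 3,$$

a number that reflects none of the three underlying judgments.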
A second problem is that, even if the averages were considered to be the result of reasoning, this reasoning was performed at the individual evaluator level — not when the narratives were written. A Government agency cannot merely treat the numbers that emerge from the evaluation process as a fact to be explained or interpreted, for this may lead to the result, or at least the suspicion, that arbitrary decisions are being laundered by the narrative writer. To the extent that the Government’s approach to review would place the score sheets of individual evaluators beyond scrutiny, it must be rejected. But this leads the Court right back into the “minutiae” thicket. Given the reliance on the average scores, how much documentation must there be, for each separate score of each individual evaluator, for the process to be rational?
At one extreme, if evaluators provided nothing but numbers, from a range of nothing but numbers, it would be hard to determine a rational connection between “facts” (such as the proposal features) and the scores. In the middle of the spectrum might be a methodology in which a scoring range has corresponding criteria, but all the evaluator produces is a number. If one needed to find the presence of A, B, and C in order to give a score of “ten,” and a “ten” is given, this is tantamount to writing down “I find A, B, and C.” And at the other extreme would be a process requiring evaluators to document in narrative form the reasons for each score given.
This case presents a methodology close to the middle of the above-described spectrum. The score sheets used by the individual evaluators provided fairly detailed scoring criteria for the Personnel factor and the over-all Program Plan evaluation. See Admin. R. at 94-95; id. at 86. For the Program Plan sub-factors, and the Experience sub-factors, the scoring criteria were as follows: “Maximum Points: Meets every aspect of the requirement and exceeds the requirement in some areas with no exceptions or weaknesses. Minimum Points: Does not meet the requirement and did not demonstrate ability to meet the requirement.” See id. at 88, 91. The score sheets also contained areas for the evaluators to write down notes. See id. at 85-95.
Since the Source Selection Plan mandated that the “average shall be the score” for each offeror, id. at 83 (emphasis added), the individual evaluations must be subject to review. Even though the ultimate contract award decision rested in the hands of the Contracting Officer, she was not authorized to use Technical Proposal scores other than the average scores to make this decision. Cf. Bowman Transp., 419 U.S. at 288 n. 4, 95 S.Ct. 438 (describing proceeding in which agency was empowered to determine facts differently from its evidentiary examiners). Thus, questions about whether scoring was arbitrary require the Court to look to the individual evaluations to “discern ... a rational basis for [the Navy’s] treatment of the evidence.” Id. at 290, 95 S.Ct. 438. Of course, the narratives in the two memoranda must also be reviewed when questions as to errors or inconsistencies in the Contracting Officer’s use of the average scores are raised.
In reviewing the individual evaluations, there is less need for detailed notes describing an evaluator’s conclusion when the scoring criteria are well-defined. An evaluator’s notes and his or her corresponding score may, however, be compared for consistency — for example, if A must be present to award the maximum score, and the evaluator writes that A is absent, this would be inconsistent with an award of the maximum score for that sub-factor. When the scoring criteria are less precise, there might be a greater need for notes to discern the basis for that aspect of the evaluation. And the extent to which evaluators made notes also has a bearing on whether any particular narrative may be relied upon to discern a rational basis for the decision, for the more detailed the notes, the more reasonable the assumption that the author of the narrative was working not as a creator but as a compiler. And finally, the amount of deference given to a particular decision will depend on how objective the associated criteria for scoring happened to be. The Court should not be in the business of comparing different evaluators’ subjective judgments to determine if one or the other was mistaken, but can look to see if a necessary element for a score is missing. Having set the ground rules, the Court will now turn to BAI’s contentions.
1. Failure to Disclose the Precise Weight of Factors and Sub-factors
Beta’s first argument addresses the entire evaluation process: it contends that the Navy was required to disclose, but did not disclose, the precise weight given to the evaluation factors and sub-factors. The applicable regulation provides:
All factors and significant subfactors that will affect contract award and their relative importance shall be stated clearly in the solicitation (10 U.S.C. 2305(a)(2)(A)(i) and 41 U.S.C. 253a(b)(1)(A)) (see 15.204-5(c)). The rating method need not be disclosed in the solicitation.
48 C.F.R. § 15.304(d).
In Section M of the Solicitation, it is clearly stated that “[p]roposals will be evaluated and rated against the factors listed below, in descending order of importance,” followed by a list in which Technical Proposal comes first, Past Performance second, and Cost/Price third. Admin. R. at 186 (emphasis added). Concerning the Technical Proposal factors, the Solicitation stated these were, “In descending Order of Importance,” as follows: Program Plan, Experience, and Personnel. Id. And in Section L of the Solicitation, it is clearly stated that the Program Plan sub-factors “are of equal importance,” id. at 176, and that the Experience sub-factors “are of equal importance.” Id. at 177.
The relative importance of the factors— and, within a factor, of the sub-factors — was specified in the Solicitation, and was followed in the Source Selection Plan. The Program Plan factor was the most important, worth 45 points; Experience was next, worth 30 points; and Personnel last, worth 25 points. See Admin. R. at 79. Each Program Plan sub-factor was worth the same five points, id. at 85, 88-89, and each Experience sub-factor was of the same importance, also five points. Id. at 91-93. The evaluations did not deviate from this plan. See Admin. R. at 1012-1160 (score sheets and notes); id. at 989-1003. Beta has cited no authority supporting the proposition that the exact numerical value or even relative weight of the technical factors must be disclosed. This argument does not support the claim that the contract award was arbitrary or unlawful.
2. Exceeding the Solicitation Requirements
Beta argues that the scoring criteria followed by the evaluators were either improper or misunderstood. In particular, it challenges the practice of assigning maximum points for a Program Plan or Experience sub-factor when an offeror “exceeds the requirement in some areas.” Admin. R. at 88, 91; see Pl.’s Mot. Mem. at 13-14, 24; Am. Compl. ¶¶ 37, 71-73. It also objects to the Personnel scoring criteria that take into account whether “proposed key personnel significantly exceed the minimum qualifications for education and experience” and whether a “majority (51%) of the non-key personnel proposed exceed the personnel qualifications listed in the solicitation.” Admin. R. at 94; see Am. Compl. ¶¶ 38-39, 42, 79, 81-83.
According to BAI, to give extra points to an offeror for exceeding the requirements of an SOW, by definition, is to use criteria that are not a part of the Solicitation. Beta further argues that evaluators need to be told exactly how they should determine if requirements are exceeded, which was not done in this process. Beta has no support for these propositions. Implicit in its argument is the notion that every offeror whose proposal meets an SOW’s requirements should receive a perfect score. This is a formula for more, not less, discretion, which would actually increase the chance of arbitrary awards — since contracting officers would often be picking between offerors who are exactly even in evaluated quality. And, in any event, if evaluators can be trusted to determine if proposals meet the requirements, there is little reason to suspect that they cannot figure out when requirements are exceeded.
Beta also contends that the evaluators must have misunderstood these criteria, as they failed to explicitly write down on their score sheets a finding that requirements were exceeded whenever they awarded a maximum score. See Pl.’s Mot. Mem. at 20-21. But since the maximum score could only be awarded if an evaluator believed that requirements were exceeded, the mere fact that this score was given is tantamount to making the finding. Beta also complains that the memoranda narratives do not explain exactly how requirements were exceeded. See id. at 16-19. But the relevant determination was made by the individual evaluators on their score sheets. While it may be true that a rule requiring a detailed explanation of exactly how an offeror exceeded requirements could reduce careless scoring errors on the part of evaluators, it would not necessarily aid in Court review — as this involves precisely the type of second-guessing of discretionary technical judgments that courts are to avoid. See E.W. Bliss Co., 77 F.3d at 449.
3. The Program Plan Average Score
Another ground for the protest is the score of 19 out of a possible 20 points for BAI’s over-all Program Plan. See Am. Compl. ¶¶ 53-56. Beta chose not to press this point in the briefing or oral argument, but was initially of the opinion that it should have received the maximum points, since it received no criticism for weaknesses in the over-all Program Plan.
The scoring criteria for the maximum 20 points for this factor were as follows:
The proposal contains no weaknesses. The Offeror has convincingly demonstrated that the RFP requirements have been analyzed, evaluated, and synthesized into approaches, methods, processes, and plans that, when implemented, should result in the delivery of services that fully address all requirements and will be extremely beneficial to the overall program.
Admin. R. at 86. Clearly, there is more to a score of 20 than merely containing no weaknesses. Beta has identified a necessary, but not sufficient, condition for a score of 20. Moreover, as Maden Tech points out, at least one opinion of our Court has rejected a similar argument. See Int.’s Mot. J. Admin. R. at 6-7 (citing Cubic Def. Sys., Inc. v. United States, 45 Fed.Cl. 450, 468-70 (1999)).
Beta’s claim is just the flip-side of its argument that a perfect score cannot be based on exceeding a Solicitation’s requirement, rejected above. The score of 19 that BAI received in this category is neither arbitrary nor unlawful.
4. The Personnel Scoring Issues
At the hearing, in addition to the claims concerning whether requirements were exceeded, BAI focused on the Personnel scores given by evaluator Gartrell. Beta protests the score of 15 it received. This score fell near the bottom of the “14-24” scoring range, the criteria for which were:
The proposed key personnel satisfactorily meet the minimum qualifications for education and experience and demonstrate knowledge and capability to perform the requirements in the Statement of Work. A majority (51%) of the key personnel are currently employed with the prime contractor. The non-key personnel proposed satisfactorily meet the personnel qualifications listed in the solicitation.
Admin. R. at 94. In awarding BAI a 15, the only notes made by Gartrell appear to read, “one of the key personnel proposed lack [sic] the educational requirement outlined in the Solicitation,” and “[XX (position A) XX] does not detail DoD related experience.” Admin. R. at 1147.
Beta attacks this in a number of ways. It argues that she fails to identify the proposed key employee lacking the educational requirement, thwarting review and therefore amounting to an arbitrary determination. See Tr. at 69-73. It points out that the experience listed for BAI’s [XX (position A) XX] reveals two paragraphs describing experience such as “a wide range of Department of Defense (DoD) classified program security applications” and “life, safety, security systems design and integration for industry supporting DoD operations.” Admin. R. at 920; see Tr. at 76-78. It contrasts this with Bailey’s score of 22 for Maden Tech’s Personnel, which was accompanied by notes stating “not enough info to determine if degrees are in the correct discipline,” Admin. R. at 1050, and “what are their current clearances.” Id. at 1051; see Tr. at 66-68. And it argues that Bailey’s determination that its [XX (position A) XX] satisfies the requirements, see Admin. R. at 1131, and Aronson’s determination that “[a]ll [of BAI’s] key personnel exceed requirements,” Admin. R. at 1159, demonstrate that Gartrell erred in her evaluation. See Tr. at 71-72.
The Court does find it troubling that so few notes were provided to justify what was, in effect, the subtraction of nine points from BAI’s score (and thus three points from its average score for this factor). But even had Gartrell identified the key employee that she
On the other hand, Gartrell’s finding that BAI’s “[XX (position A) XX] does not detail DoD related experience,” Admin. R. at 1147, appears to be clearly erroneous and without any connection to the record facts. First, it is simply not true that this experience is not detailed in the two paragraphs describing this individual. See Admin. R. at 920. No matter how one defines “detail,” the information provided on this individual — including “a wide range of Department of Defense (DoD) classified program security applications,” specific work “for industry supporting DoD operations,” and his having “interfaced with multiple Government and national defense contractors to assure security compliance with NRO, DCID, and DIA requirements,” id. — would seem to fill the bill. But more importantly, Department of Defense related experience is not listed in the Solicitation as a requirement for this position. See Admin. R. at 121. This is in contrast to the requirements for other key positions, such as Senior Financial Analyst, see id., Analyst, International Security Specialist, and Security Training Specialist. See id. at 122. This finding is clearly erroneous, and it appears to be a major factor in BAI receiving only 15 of the possible 24 points in the range in which it fell. This was, after all, included in one of only two notes made on the score sheet for this sub-factor, and the position is listed second among key positions, which might be taken as some indication of its importance to proposals. See id. at 121. While the exact impact of this arbitrary decision on the score may be hard to discern, this is one of the risks the Government assumes in employing such an evaluation process.
Beta’s argument concerning the Personnel score given Maden Tech by Gartrell is on even more solid footing. With no explanatory notes, Gartrell gave Maden Tech the full 25 points for Personnel. Admin. R. at 1035. The scoring criteria for a 25 read:
The proposed key personnel significantly exceed the minimum qualifications for education and experience and demonstrate knowledge and capability to perform the requirements in the Statement of Work. All of the key personnel are currently employed with the prime contractor. A majority (51%) of the non-key personnel proposed exceed the personnel qualifications listed in the solicitation.
Admin. R. at 94.
Although both the Government and Maden Tech take the position that these criteria are mere “advice” or “guidance,” see Tr. at 192, 235, nothing in the language suggests these are just examples or illustrations. There is no preface with the words “may include,” or use of the disjunctive “or,” or anything indicating that these are not necessary conditions for a score of 25. The most natural reading is that all three must be satisfied.
Of these three requirements, the middle one — that “[a]ll of the key personnel are currently employed with the prime contractor” — involves no subjective judgment whatsoever. And the record is clear that two of the key employees proposed were not employed by Maden Tech. See Admin. R. at 993.
Maden Tech was not eligible, based on an objective and readily-verifiable criterion, to receive 25 points for personnel. Gartrell’s score was clearly erroneous. Beta argues that the Court should consider that, scored properly, Maden Tech could have received as few as 14 points (assuming it fell in the next scoring range), but it recognizes that a 22 or 23 — the scores given by the other two evaluators — is more likely. See Tr. at 59-61. It also argues that at least one, if not both, of the other conditions were also not met to merit a 25. But even if the lack of these other conditions rules out a 25, this amounts to nothing more than overkill. None of them is needed to receive the highest score in the next range. And under these circumstances, given that she gave Maden Tech a 25 — albeit one it was not qualified to receive — it is hard for the Court to conclude that the presumed score from Gartrell should be any lower than a 24. The award of a 25 was, nonetheless, clearly arbitrary.
5. The “Not Applicable” Adjustments
One evaluator, Bailey, initially chose not to give BAI a score for two sub-factors he thought were not applicable. He wrote “N/A” on his score sheets in place of a number, and explained, “I marked risks associated with contract performance and contract transition as N/A which lowers the total possible base score to 35” for Program Plan. Admin. R. at 1112.
The first of these sub-factors, “Risks Associated with Contract Performance,” was described as follows:
The Offeror shall identify any risks associated with the assumption of and the performance on the contract, to include how, if there are risks, how they will mitigate them and how they will reduce the contract transition time and the cost, turbulence, and any risk that may be associated with the contract transition (if applicable).
Admin. R. at 176; see also id. at 88. On his score sheet, Bailey wrote for this sub-factor, “[t]hey state none but if they win, is it applicable?” Admin. R. at 1115. The Government misreads (and misquotes) this, contending that Bailey assigned BAI no points “because it ‘did not state’ any contract performance risks that it would face.” Def.’s Resp. Br. at 9. In actuality, BAI explained in its proposal that “there are no risks to contract performance” if it were selected, for it was the incumbent, it was “currently fully manned and on the job,” and “[a] full staff and experience mitigates risks associated with contract performance and BAI has the security professionals on the job with the right qualifications.” Admin. R. at 888. It mentioned three risks facing a new contractor, described the first two, and then explained how it had the experience to cope with the third:
The other risk scenario is [XXXX]. [XXXX ... XXXX]. [XXXX], [XXXX]. This experience will help mitigate similar situations.
Id. at 888-89.
Thus, contrary to the Government’s implication that BAI failed to discuss contract performance risks, BAI instead explained why it believed its experience as the incumbent “will help mitigate similar situations” to the one described. As Bailey wrote elsewhere in his separate notes, BAI really stated there were “[n]o risks as they are the incumbent.” Admin. R. at 1125. On this same sheet of paper, Bailey noted: “Ask to [sic] how to rate on this factor.” Id. Since the bulk of the sub-factor concerns transition issues (with the description mentioning “assumption of” the contract once and “transition” twice, see id. at 88); its description contains the tentative “if there are risks”; and the sub-factor ends with “if applicable” in parentheses, Bailey reasonably wondered whether it even applied.
The second of these sub-factors, “Contract Transition,” was described as follows:
The Offeror shall describe the methods and processes that will be used to transition responsibility and performance from the incumbent contractor to the new contractor (if applicable). The methods shall address the seamless transition of functions, administration and records, and property, and the accomplishment of necessary training and familiarization during the transition to assume all functions and responsibilities.
Admin. R. at 176; see also id. at 89. Since there is no “new contractor” when the incumbent is awarded the contract, there would be no “methods” of transition to address were BAI the successful offeror.
There are only two rational ways to treat sub-factors which do not apply to an offeror. The first would be to give the offeror a maximum score for each sub-factor, on the theory that the matters are fully satisfied if they can never arise. This would have bumped the BAI Program Plan score given by Bailey up to a 38. The other way would be to pro-rate the rest of the sub-factors. Thus, if the offeror received 28 out of an available 35 points, it would be given a score of 36 on a 45-point scale (as it received four out of every five points available).
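Using the figures from the Court’s own example, the pro-rating alternative works out as follows:
\[
\text{pro-rated score} = \frac{28}{35} \times 45 = 0.8 \times 45 = 36.
\]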
The Government contends that this was the result of Bailey deciding to give BAI a five for Contract Transition and a one for Risks Associated. Tr. at 128. It concedes “[h]is decision actually is not in the record, but he signed off on the sheet.” Id. It strikes the Court as unusual that Bailey, who took the most copious notes and actually initialed the places on his score sheets where he crossed out and corrected numbers, see Admin. R. at 1118, left no record of this decision to give just one point for the Risks Associated sub-category. He gave BAI no other score as low as “one.”
It gets more unusual when one consults the narratives from the memoranda. In the write-up on BAI’s Risks Associated with Contract Performance sub-factor, there is the misstatement that “[t]he proposal did not address any specific risks, though some may exist even with the incumbent ... the program plan could have addressed some potential risks for contractor performance and how they planned on mitigating these risk[s].” Admin. R. at 994. As was noted above, mitigation of a specific risk was addressed. But notwithstanding this criticism, BAI was given a score of four. Id. This number is consistent with the average score derived from the two evaluators who actually provided a score — Aronson’s five and Gartrell’s two. See id. at 1148, 1135. But it is not consistent with a score of just “one” from Bailey, which would result in an average score of three (rounded up from 2.67). For Bailey’s score, when averaged with the others, to keep the average score at four, it would have had to have been at least a four. Given the score of four in the memorandum, and the adjustment necessary due to Bailey’s failure to score the sub-factor initially, it would be arbitrary to impute to Bailey any score other than a four for this sub-category.
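The arithmetic bears this out. With Aronson’s five and Gartrell’s two, a Bailey score of one cannot yield the four reported in the memorandum, while a four can (rounding to the nearest whole number):
\[
\frac{5 + 2 + 1}{3} = 2.67 \approx 3, \qquad \frac{5 + 2 + 4}{3} = 3.67 \approx 4.
\]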
But only six points separate the initial 28 that Bailey gave BAI for the Program Plan factor from the 34 on the Summary Sheet, and a four for Risks Associated would leave only two points for Contract Transition. This would clearly be erroneous. The Contract Transition sub-factor truly did not apply to BAI. The memorandum narrative acknowledged this, as BAI was given the full five points because “[t]his element was not applicable since BAI is the incumbent contractor.” Admin. R. at 995. Where this leaves us is that on the score sheets that individual evaluators were required to use, see id. at 83, no score was given by Bailey for the two sub-factors. The narrative supplies a score of four for the Risks Associated sub-factor, which implies that Bailey’s score was a four. The narrative also stated the policy that an inapplicable sub-factor should receive a score of five. Thus, the score of 34 for BAI’s Program Plan factor contained in the Summary Sheet cannot be reconciled with these scores, and was erroneous.
The policy of providing the full “5” when a sub-factor does not apply, however, gives rise to another inconsistency in the scoring, as Gartrell gave BAI just a three for Contract Transition. See Admin. R. at 1135, 1139. This score, too, must be considered arbitrary and erroneous. Since the sub-factor clearly did not apply to BAI, the plaintiff could not have rationally been given just a three.
6. The Staffing Plan and Corporate Support Sub-factors
Beta challenges the specific score it received in the memoranda narratives for two Program Plan sub-factors, Staffing Plan and Corporate Support. See Am. Compl. ¶¶ 57-67; Pl.’s Mot. Mem. at 17-18; Tr. at 115-18. For its Staffing Plan, BAI was given three points. Admin. R. at 995. The only criticism in the narrative was that “[t]he plan was actually a statement and lacked explanations for recruitment and replacement of current employees and capabilities to respond to surges in work requirements, etc.” Id. Beta points out that there was no requirement in the Solicitation that “surges” be addressed under Staffing Plan, see id. at 176, and that it did in fact discuss responses to surges in several places in its general Program Plan proposal. Tr. at 115-18 (citing BAI proposal ¶¶ 1.8, 1.9, 1.9.1, 1.14; see Admin. R. at 885-86, 888). The Government’s response is that “the issue of work surges” is “an intrinsic factor that a person should acknowledge as a problem” connected with staffing, see Tr. at 215-16, and that this is the sort of judgment that is left to the discretion of evaluators. See id. at 216-18. Maden Tech made the same argument that this issue “was intrinsic to the staffing plan.” Id. at 252; see also id. at 250-52; Int.’s Opp. Mem. at 13-14.
For Corporate Support, BAI was also given a three. Admin. R. at 995. The narrative contains a brief summary of BAI’s Corporate Support section. As was noted above, this summary ignores the information in BAI’s second paragraph, even though that paragraph begins with the signal phrase “[m]ore importantly.” Compare id. with id. at 890. And it ignores the description of how BAI’s Senior Review Group operates, even though all three evaluators note this feature in connection with their Corporate Support scores, see id. at 1117, 1140, 1153, and despite the memorandum’s description of the SRG as a “unique management tool” in the Program Plan narrative. See id. at 994. The Government responds that there is “no requirement that the evaluators give more points for” the SRG, Def.’s Mot. J. at 15. Maden Tech brushes the argument aside as “at best ... a mere disagreement with the Source Selection Authority’s decision,” Int.’s Mot. J. at 13-14, and argues that taking note of the SRG under the Corporate Support sub-factor as well as under the over-all Program Plan sub-factor “would have resulted in a double-counting of features that was not authorized by the Solicitation.” Id. at 15.
It somewhat strains credulity that responses to work surge requirements are such an intrinsic part of the staffing plan that these had to be explained under that particular heading. This is particularly the case considering the amount of detail provided in the Solicitation concerning this sub-factor, requiring a graphic depiction of “organization and reporting relationships,” the specific number of personnel needed, a training program, an initial orientation, and even the provision to employees of “[a]ppropriate written materials” including “necessary work telephone numbers, and a tour of the DARPA.” Admin. R. at 176. The description is five sentences long, but mentions work surges not once. It is also hard to see how such an important part of the plan was not noticed when described in numerous places in BAI’s Program Plan proposal. Perhaps the failure to explicitly include these discussions under “Staffing Plan” is the sort of judgment call that would ordinarily be left to the evaluators’ discretion.
It is also telling that the Corporate Support narrative omits the very features that BAI found most important, including the [XXXX] that each individual evaluator took note of. The latter omission is a rather strong indication that the narratives were not the product of a consensus reached by the individual evaluators. But mindful of the proper role of Courts in reviewing procurement decisions, if this were all there was to the issue of these two scores, the Court would be inclined to find it too close to a second-guessing of the minutiae to meet the heavy burden of showing arbitrary action.
There is a more fundamental problem with these two scores. In both instances, BAI received a score of three, when the average score for each sub-factor was a four. Compare id. at 995 with id. at 1112, 1135, 1148. The Government’s explanation for this inconsistency is that the sub-factor scores of the individual evaluators have no significance. See Tr. at 132-35. It argues that “on this explanatory memorandum, the actual points given within a subfactor, that was explanatory to go along with the explanatory language.” Id. at 132. It also cites the problems that typically occur when rounded numbers are used in a narrative explanation — essentially, that without some tweaking, the numbers might not add up properly. See id. at 138-39. And it reiterates its position that the memorandum was “a consensus explanation” which, moreover, “was not decisive.” Id. at 140. According to the Government, as long as the point total for an entire factor corresponds to the average score for that factor, the evaluation process is rational.
But rounding cannot explain this discrepancy, as the average was exactly four for each of the two sub-factors. And the process, even as the Government describes it, is hardly rational. This process would require the source selection to be based on the average of the total scores given by the individual evaluators for each category, which is then explained by a narrative, produced by consensus, that can distribute the points however the authors wish so long as the total for the factor adds up to the average score. Even if a consensus method were used to draft the narrative (a conclusion which finds no support in the record and is rejected by this Court), how is it rational to bind the evaluators to the factor averages, but allow them free rein to re-score the sub-factors, subject only to the externally-imposed constraint of the factor average?
As an example, it might be the case that, for a sub-factor given scores of 3, 4, and 5 by three different evaluators, in a consensus meeting the person giving the lowest score can prevail upon the other two to accept his or her view of the sub-factor. Under the Government’s method, when one point is deducted from the average for this sub-category, to restore the magical equilibrium the score for another sub-factor must automatically rise by one — otherwise, the factor score does not add up. This must happen even though not one evaluator changes his or her opinion concerning any one of the other sub-factors.
The Government insists, however, that since the Summary Sheet used to document the evaluators’ scores contains cells only for the total factor scores, and not the sub-factor scores, the latter must drop out of the equation. See Tr. at 138 (citing Admin. R. at 1004). But this defies logic. The only reason that Aronson’s Program Plan score added up to the 43 on the Summary Sheet, for instance, was because she gave BAI sub-factor scores of 20, five, three, five, five, and five. See Admin. R. at 1148. Change any one of these, and the 43 is changed to another number. Change the 43 to another number, and the average score of 36.66 changes. The average score of each factor is directly dependent on the sub-factor scores. An evaluation process which rigidly maintains the former but not the latter is internally inconsistent and, in a word, irrational. If the sub-factor scoring, which directly flowed into the hard averages used in the evaluation, is severed from the explanation of the evaluation, the links between the facts found and the decisions made are broken. It was arbitrary and irrational for the Navy to have given BAI sub-factor scores in the narratives that departed, without explanation, from the averages of the individual evaluators’ scores.
7. The Second Set of Maden Tech Score Sheets
One final issue, discussed at the hearing, is the presence in the record of a second set of score sheets for Maden Tech. See Tr. at 108-15, 125. On these score sheets, Aronson gave Maden Tech three fewer points in total for the Program Plan factor, and two more points for the Personnel factor. Compare Admin. R. at 1012-17 (Program Plan score of 44), 1022-23 (Personnel score of 23) with id. at 1161-65 (Program Plan score of 41), 1169-70 (Personnel score of 25). On Gartrell’s second set of score sheets, Maden Tech received six fewer points for the Program Plan factor. Compare id. at 1025-30 (score of 43) with id. at 1171-75 (score of 37).
The Government did not address the presence of this second set of scores, see Tr. at 125-224, except for stating, concerning the score sheets, that “there’s no reflection that these points figured in.” Id. at 112. But that may be the very problem with them. The inclusion of scores from an individual, Meade, who was not a member of the evaluation panel, is unexplained. Among this set of scores are scoring sheets for Experience that contain different sub-factors than pertain to the Solicitation, and reference paragraph numbers that do not exist in the SOW. Compare Admin. R. at 1176-78 with id. at 107-14. This latter fact makes the scores on these particular sheets either the result of arbitrary action, or part of a different, unrelated procurement decision involving Maden Tech. But the Government has represented that the Solicitation award at issue was Aronson’s first time in the role of evaluator, Tr. at 208, and the second set of score sheets was included in the record filed by the Government. The Court thus cannot but conclude that they relate to the contract award challenged by BAI.
The Source Selection Plan, however, did not authorize two different rounds of individual evaluations. See Admin. R. at 79-84. And despite the crucial importance of the individual evaluators’ scores in the source selection decision, there is no explanation offered by the evaluators as to why one set of scores should be preferred to the other. If the scores changed because the evaluators looked at the Maden Tech proposal a second time, then BAI was deprived of this same opportunity. And if there is no reasoned explanation for the difference in scores of the same sub-factors and factors by the same evaluators, why were the scores from this other set not used? This departure from the Source Selection Plan reflects unequal treatment of offerors, and is therefore arbitrary. Cf. United Int’l Investigative Servs., Inc. v. United States, 41 Fed.Cl. 312, 320-22 (1998) (prejudice found where one evaluator’s lowered re-evaluation of plaintiff’s proposal taken as representative of entire six-member evaluation board), aff’d, 194 F.3d 1335 (Fed.Cir.1999) (table).
D. Did the Arbitrary Actions Prejudice the Evaluation of BAI’s Offer?
Having identified a number of arbitrary actions taken by the Navy in the procurement decision at issue, the Court must determine the factual question of prejudice to the protester. See Bannum, 404 F.3d at 1357. Since the methodology used by the Navy was based on the scores of the individual evaluators, for ease of exposition the Court will analyze the results of the arbitrary actions first with reference to those scores, rather than the scores used in the narratives.
First, there is the matter of Gartrell’s Personnel score for BAI. She awarded it 15 points, near the bottom of the 14-24 point scoring range. This score was based, in part, on the arbitrary finding that a proposed key employee lacked the experience he actually had, for a position that did not require it. The other determination affecting this score was that the person proposed for an unidentified key position lacked the educational requirements. Under these circumstances, the Court concludes that the arbitrary determination must have been worth at least half of the points deducted from BAI’s score, and that correcting it would raise Gartrell’s Personnel score for BAI from a 15 to a 19.
Next is the matter of Gartrell’s Personnel score for Maden Tech. Maden Tech clearly could not have received the 25 awarded it, as it is undisputed that Maden Tech did not, at the time of its proposal, itself employ several individuals proposed as key employees. But with the evaluator having found Maden Tech to merit the maximum points for this factor, it would be unreasonable to conclude that the correction of this error would lower its score by any more than one point, to a 24.
Third, we have the requisite adjustments to Bailey’s scores. To make his scores consistent with the scoring used in the memoranda narratives, his Risks Associated with Contract Performance score should have been at least a four, and his Contract Transition score must be a five to reflect the inapplicability of that sub-factor. This would add nine points to the “base” score of 28 from his score sheets, see Admin. R. at 1112, making the Program Plan score a 37, rather than the 34 contained in the Summary Sheet.
To eliminate the arbitrary treatment of BAI’s Contract Transition sub-factor by Gartrell, that score would have to be raised from a 3 to a 5. This would raise the Program Plan score she gave BAI from a 33 to a 35. Since we are adjusting the individual evaluators’ scores to estimate the possible impact of the arbitrary decisions, as opposed to adjusting the scores contained in the memoranda narratives, no adjustment is needed to correct for the arbitrary treatment of BAI’s sub-factor scores for Staffing Plan and Corporate Support.
Finally, there is the impact of the second set of scores for Maden Tech. No reason exists in the record for the use of one group of factor scores as opposed to the other, made by the same evaluators. Thus, the Aronson Program Plan score for Maden Tech could have been as low as the 41 that is reflected in the set of scores not used. See Admin. R. at 1161-65. And the Gartrell Program Plan score for Maden Tech could have been the 37 appearing in the second set of score sheets. See Admin. R. at 1171-75.
Taking all of these adjustments into account, the resulting scores from Aronson, in descending order of importance (Program Plan, Experience, Personnel) would have remained 43, 30, and 24 for BAI, for a total of 97; and would be 41, 26, and 23 for Maden Tech, for a total of 90. The Gartrell scores for BAI would be 35, 22, and 19, for a total of 76; and for Maden Tech would be 37, 25, and 24, for a total of 86. Bailey’s scores for BAI would be 37, 27, and 23, for a total of 87; and for Maden Tech would remain 40, 16, and 22, for a total of 78. Using the adjusted scores would result in an average score of 86.66 for BAI and an average score of 84.66 for Maden Tech. Although the Court cautions that this is merely an estimate of what the scores could have been absent the arbitrary aspects of the evaluation process, it is enough to demonstrate prejudice to BAI. Beta could have been the offeror with the highest Technical Proposal score, requiring a best value determination between it and the other two offerors. It would have thus had a substantial chance to receive the award, but for the arbitrary actions of the Navy.
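For clarity, the adjusted averages follow directly from the evaluators’ adjusted totals (truncated to two decimal places, consistent with the Court’s figures):
\[
\text{BAI: } \frac{97 + 76 + 87}{3} = \frac{260}{3} = 86.66\ldots \qquad \text{Maden Tech: } \frac{90 + 86 + 78}{3} = \frac{254}{3} = 84.66\ldots
\]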
III. CONCLUSION
The Navy acted arbitrarily and capriciously in its evaluation of proposals submitted in response to Solicitation No. N00174-03-R-0044. Plaintiff Beta Analytics International, Inc., would have had a substantial chance to receive the contract award absent the arbitrary evaluation. Accordingly, BAI’s motion for judgment on the administrative record is GRANTED and the motions of the Government and Maden Tech are DENIED.
This decision addresses only the merits, and not the form of any injunctive relief that may be appropriate. In light of the contract performance that has already occurred, the parties shall file a joint status report on or before August 28, 2005, addressing injunctive options and proposing a schedule for further proceedings.
IT IS SO ORDERED.
. Oddly, "Past Performance” is listed below the three Technical Proposal factors in a list ranked "in order of importance,” but is also described as "equal in value to factors 1 through 3 combined,” Admin. R. at 79. Fortunately, all three offerors received the same "Excellent” rating for Past Performance, see id. at 989, and no issue is raised concerning this factor.
. [XXXX] received a final averaged score of [XX]. Admin. R. at 1004.
. Tacked on to the end of each discussion are also a couple of sentences on Past Performance, which were identical except for the name of the offeror. See Admin. R. at 993, 998, 1003.
. Admin. R. at 1151.
. Admin. R. at 1138.
. Admin. R. at 994.
. Admin. R. at 1152.
. Admin. R. at 1139.
. This is in contrast to the treatment of [XXXX], whose Program Plan average scores included a [XXXX] for the first and third sub-factors, and a [XXXX] for the fourth, see Admin. R. at 1064, 1077, 1099, which resulted in memorandum scores of [XXXX]. See id. at 998-99. Had each been rounded up to the nearest whole number, the total score for the factor would have been one point higher than the average, and to avoid this, apparently, the first was lowered.
. This analysis of BAI's proposal is also inaccurate. While BAI did state that it would avoid three situations posing risk to a new contractor, it specifically explained how it mitigates the third of these — "short notice surge requirements.” See Admin. R. at 888-89.
. It appears that BAI did address these issues, in places in the Program Plan proposal other than the "Staffing Plan” section. See, e.g., Admin. R. at 885 (paragraph 1.9 stating BAI "has responded to short notice requirements by [XXXX]”), 886 (paragraph 1.9.1 explaining how BAI "has consistently been successful in adjusting staffing levels to meet government requests for more staff”), 888-89 (explaining response to short notice requirements).
. The one exception, BAI’s score of a four in the subcategory of "Experience in Communications Security,” is explained by rounding adjustments. Of the other five sub-factors, three were rounded up to the next whole number from a 4.66 or a 3.66, and two were rounded down from a 4.33. If the Communications Security score were rounded up from 4.66 to a five, the resulting total for the factor would have been a 27, rather than the 26 (rounded down from 26.33) derived from the average of the individual evaluators’ total score for the Experience factor. See Admin. R. at 995-97, 1004, 1118, 1141, 1154. While there would have been nothing wrong with this alternative, provided that a disclaimer concerning the discrepancy due to rounding errors were provided, the Navy apparently chose a more confusing approach.
. The sub-factor scores of [XXXX] also corresponded exactly to the individual evaluators’ averages. See Admin. R. at 1000-02, 1070, 1083, 1105.
. That is, to the extent comments were made. Gartrell gave Maden Tech the full 25 points without comment. See Admin. R. at 1035-37.
. The area on the BCM for the signature of the "Approving Official” was left blank. Admin. R. at 1212.
. The price difference is stated as “11.3%” instead of the twelve percent figure used in the other memorandum. See Admin. R. at 1003, 1234. Both figures are wrong, as Maden Tech’s proposed price was in actuality 12.8% higher than [XXXX]’s proposed $[XXXX].
. One might wonder if these sheets related to a different contract award. The Government, however, represented that the Solicitation award at issue was the first time that Aronson had evaluated contract proposals. Tr. (Sept. 1, 2004) at 208.
. It is not clear whether this standard comes from 5 U.S.C. § 706(2)(A), as being "otherwise not in accordance with law,” or under 5 U.S.C. § 706(2)(D), as an example of an action taken "without observance of procedure required by law.” Presumably, 5 U.S.C. § 706(2)(B)-(D) are mere elaborations on the 5 U.S.C. § 706(2)(A) theme, and thus apply in all bid protests.
. The recent Federal Circuit opinion in Bannum v. United States could be read to eliminate this initial step. See Bannum, 404 F.3d at 1351 (describing only "two steps” in bid protests, first the "rational basis or contrary to law” determination and then the prejudice determination).
. Beta’s claim concerning 48 C.F.R. § 9.504(e) does not amount to a “clear and prejudicial violation” of the regulation. Impresa Construzioni, 238 F.3d at 1333. The basis of this claim is the absence of any documentation in the record showing that a conflict of interest determination concerning Maden Tech’s ability to compete for the contract was performed by the Navy. See Am. Mem. Supporting Pl.’s Mot. J. Admin. R. at 26-28. This Court had previously held that such documentation is not necessary in the absence of a substantive issue of a potential conflict of interest, Beta Analytics, 61 Fed.Cl. at 228, and that no evidence had been identified supporting the existence of such a conflict. Id. at 227-28. Beta has still not identified any evidence that would support a determination that an organizational conflict of interest would arise from award of the contract to Maden Tech, and thus even the failure to conduct the evaluation would not be prejudicial.
. As was noted above, the narratives were identical in the two documents. Compare Admin. R. at 989-1003 with id. at 1222-30.
. The first five counts in BAI’s complaint covered, in order, challenges to: 1) the Program Plan factor evaluation; 2) the staffing plan sub-factor evaluation; 3) the corporate support sub-factor evaluation; 4) the Experience sub-factor evaluations; and 5) the Personnel factor evaluation. See Am. Compl. ¶¶ 51-84. For ease of organization, given the cross-cutting nature of many of the issues raised, the Court discusses these claims based on the issues or evaluator involved rather than by the count under which they fall.
. Moreover, handwritten notes on ruled paper that appear to be Gartrell's, albeit somewhat cryptic, have "-1 BA” written next to these positions amidst notes on BAI’s proposal, which suggests these were her concern. Admin. R. at 1062.
. Perhaps coincidentally, on the Chairperson's Summary Sheet, it appears as though the score of "36” was initially written down as Bailey's Program Plan score for BAI, with a "4” written over the "6.” See Admin. R. at 1004.
. Maden Tech’s latter point is undermined by the very nature of the Program Plan sub-factor, which embraces the "comprehensive” plan, including the enumerated sub-factors. See Admin. R. at 175-76. Moreover, Maden Tech’s staffing plan was mentioned in the Program Plan narrative of the memorandum, as well as discussed under the Staffing Plan sub-factor narrative. See id. at 990, 991.
. This is not to say that adjustments that are made for rounding purposes would be arbitrary.
See note 8, above.