How Do We Know?

Unlike purer subjects such as physics and mathematics, foreign language education, with its myriad intangibles, is not easy to measure. Yet we, Tarot fools, stride along the edge of the cliff, confident that the data we use to measure our students' gains is accurate.
Where does this cavalier trust in data come from? Is it warranted? How does the existence of a common assessment based on a book justify, as Grant asked, turning passage to the next level into a bottleneck, one that forces really talented kids to stop their language study because they currently care nothing about verb conjugations and object pronoun agreement?
Obviously, people who work in assessment at the district level are the ones qualified to address this question. As a mere classroom teacher with little knowledge of how data works, I appeal to your knowledge and insight. Here is the question again: how do we know that our use of data in foreign language assessment is accurate?