BILL ANALYSIS
AB 173
Page 1
Date of Hearing: April 29, 2009
ASSEMBLY COMMITTEE ON EDUCATION
Julia Brownley, Chair
AB 173 (Price) - As Amended: April 14, 2009
SUBJECT : Low performing schools
SUMMARY : Requires the California Department of Education (CDE)
to contract for the development of a new measure to replace the
Academic Performance Index (API), and requires the CDE to
convene a new advisory board to provide general guidance and
make recommendations toward that end. Specifically, this bill :
1)States Legislative intent to adopt a new indicator of academic
performance that measures pupil-level growth over time,
replaces the API, serves state and federal accountability
functions, and is ready for implementation by the 2015-16
fiscal year.
2)Requires, subject to the availability of federal funds for
this purpose, the CDE to contract for the development of a new
indicator that:
a) Measures pupil-level growth on the Standardized Testing
and Reporting (STAR) tests, the English language
development test, and the high school exit examination.
b) Evaluates and determines the most effective way to
modify existing tests to allow the indicator to measure
pupil growth over time.
c) Allows the state to measure Adequate Yearly Progress,
identify schools for Program Improvement, and otherwise
comply with the federal No Child Left Behind Act (NCLB).
d) Serves state accountability functions.
e) Allows the state to make required assurances under the
American Recovery and Reinvestment Act.
f) Distinguishes among low-performing schools and local
education agencies and streamlines eligibility requirements
in order to better target state resources.
3)Requires the CDE to convene a broadly representative advisory
board consisting of representatives from the state board, the
Secretary for Education, the Department of Finance, the
Legislative Analyst's Office, parent groups, school districts,
and education researchers to provide general guidance and make
recommendations relative to modifying assessments, academic
content standards, performance expectations, and eligibility
criteria for state support and resources.
4)Requires, subject to a budget appropriation, the CDE to
contract with a consultant for independent oversight of the
project to develop a new academic performance indicator,
requires the Director of Finance (DOF) to review the request
for proposals for the contract, and requires specified written
reports from the consultant.
5)Requires these provisions to be implemented using federal
funds received under NCLB upon the approval of an expenditure
plan by DOF.
EXISTING LAW
1)Requires the Superintendent of Public Instruction (SPI), with
the approval of the State Board of Education (SBE), to develop
and implement the API to measure the performance of schools,
and to include a variety of indicators, including achievement
test results, attendance rates, and graduation rates in that
measure.
2)Requires the SPI to establish an advisory committee to provide
advice on all appropriate matters relative to the creation of
the API.
3)Directs the advisory committee by July 1, 2005, to make
recommendations to the SPI on the appropriateness and
feasibility of a methodology for generating a measurement of
academic performance by using unique pupil identifiers and
annual academic achievement growth to provide a more accurate
measure of a school's growth over time.
4)Establishes the STAR program to test academic skills in grades
2-11 and to report individual and aggregate results, the
English language development test to test the acquisition of
English skills and to report individual results, and the high
school exit examination as a high school graduation
requirement to report individual and aggregate results.
FISCAL EFFECT : Unknown
COMMENTS : The SPI established, pursuant to SB 1 X1 (Alpert),
Chapter 3, Statutes of 1999-2000 First Extraordinary Session, an
advisory committee to advise the SPI and the SBE on all
appropriate matters relative to the creation of the API. SB 1
X1 also requires the SPI, with the approval of the SBE, to
develop the API to measure the performance of schools, and to
include a variety of indicators in that measure, including, but
not limited to, achievement test results, attendance rates, and
graduation rates. Currently only achievement test results are
incorporated into the API, and the API is configured to produce
scores measuring a school's static performance at each grade
level, in each content area, in each year, at one point in time.
In addition, the SPI produces a "Growth API" that compares
this static performance from one year to the next by comparing
cohort or group scores. This growth API, however, does not
measure true value added for a specific group of students and is
not based on the year-to-year information for individual pupils;
in other words that measure may only be reflecting the
differences in cohorts of pupils that were in one grade level
over two different years, rather than actual growth for a fixed
set of students over time.
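The difference between the two kinds of comparison can be shown with a minimal Python sketch. All pupil identifiers and scores below are invented, the function names are hypothetical, and this is not the method used to compute the API. For illustration only, the sketch also assumes that scores in adjacent grades sit on a comparable scale.

```python
from statistics import mean

# Hypothetical records: (pupil_id, year, grade, score). All values are invented.
records = [
    ("A", 2008, 4, 300), ("B", 2008, 4, 320), ("C", 2008, 4, 340),
    ("D", 2009, 4, 350), ("E", 2009, 4, 360), ("F", 2009, 4, 370),
    ("A", 2009, 5, 310), ("B", 2009, 5, 335), ("C", 2009, 5, 345),
]

def cohort_change(records, grade, year_from, year_to):
    """Change in the mean score of one grade level across two years (different pupils)."""
    before = [s for _, y, g, s in records if y == year_from and g == grade]
    after = [s for _, y, g, s in records if y == year_to and g == grade]
    return mean(after) - mean(before)

def matched_growth(records, year_from, year_to):
    """Mean score change for pupils who were tested in both years (same pupils)."""
    first = {p: s for p, y, _, s in records if y == year_from}
    second = {p: s for p, y, _, s in records if y == year_to}
    matched = first.keys() & second.keys()
    return mean(second[p] - first[p] for p in matched)

# Grade 4 in 2008 versus grade 4 in 2009: two different groups of pupils.
grade4_change = cohort_change(records, 4, 2008, 2009)
# Pupils A, B, and C followed from 2008 to 2009: one fixed group of pupils.
pupil_growth = matched_growth(records, 2008, 2009)
```

In this invented data the grade 4 average rises by 40 points, while the pupils who were actually followed gained 10 points on average; the larger figure reflects a different group of pupils rather than growth.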
What is the impact of not being able to compare individual test
scores or the aggregate API over time? Even though individual
STAR test scores look the same from one year to the next and
allow a relative comparison to other students in the same grade
level in a given year, a student's scores are not comparable
across grade levels; this means that the student, parents, and
teachers cannot tell if a student has improved or is achieving
at a lower level from one year to the next based on the test
scores that they receive. In short, we don't know whether the
520 that a student scores this year is higher, lower, or the
same as the 500 that student scored in the previous grade. The
primary impact of this shortcoming is that we are unable to
determine whether a specific instructional program designed to
maintain a student's academic growth or to accelerate that
student's growth is actually doing so. In the same way, the
inability to compare API results from one year to the next,
except through the current growth API that effectively measures
the results in one grade level for two successive and different
cohorts of students, restricts the state's ability to make
judgments about how a school's or district's instructional
program impacts its students' academic progress over time. In
other words, we are unable to tell whether school reform or
school improvement efforts are actually achieving results in
terms of academic growth in a school or district; even if we
could see growth, we are unable to really measure how great that
growth is. Clearly very large changes from one year to the next
would show up as very large changes in individual scores and in
the API, but large dramatic changes in one year are not
generally the result of school improvement. The lack of ability
to make comparisons over time has also hurt the state in terms
of its ability to take advantage of opportunities, provided as
part of the federal accountability system defined under No Child
Left Behind, to adopt more flexibility in establishing how
schools and districts meet the standard of Adequate Yearly
Progress (AYP); this in turn has implications for schools and
districts moving into Program Improvement status and eventually
being mandated to accept various forms of state intervention,
including the possibility of state takeover.
Why can't we make these comparisons over time? There are three
primary obstacles that face any large-scale assessment and
accountability system that attempts to generate measures that
allow valid comparisons of achievement over time: cohort
instability, content discontinuity, and score incomparability.
Cohort instability simply refers to the fact that a school or
district won't have the same set of students in one grade this
year that it had in the previous grade the year before; in other
words, students move in and students move out of schools and
districts. This means that an aggregate measure, like the API,
is based on a different set of student scores in each of those
two years, and if the students from one year to the next are
different, then we cannot know whether a change in the API
results from the work that the school, district, students and
parents have done or simply from the fact that the academic
achievement of the two sets of students is different. While
this problem may have an insignificant effect in some schools
and districts, California has schools and districts with
year-to-year turnover that exceeds 100 percent - meaning that
more students have left and come into the school or district
over the last year than were enrolled last year.
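As a worked illustration of how turnover can exceed 100 percent, the following Python sketch counts pupils who left plus pupils who entered as a share of last year's enrollment. The definition is taken from the description above; the function name, pupil identifiers, and enrollment figures are invented for illustration, and normal grade promotion is ignored.

```python
def turnover_percent(last_year_ids, this_year_ids):
    """Pupils who left plus pupils who entered, as a percent of last year's enrollment."""
    last_year, this_year = set(last_year_ids), set(this_year_ids)
    left = last_year - this_year
    entered = this_year - last_year
    return 100 * (len(left) + len(entered)) / len(last_year)

# Invented example: 10 pupils were enrolled last year (p1..p10).
last_year_ids = [f"p{i}" for i in range(1, 11)]
# Four of them stay (p1..p4), six leave, and six new pupils arrive (n1..n6).
this_year_ids = [f"p{i}" for i in range(1, 5)] + [f"n{i}" for i in range(1, 7)]

rate = turnover_percent(last_year_ids, this_year_ids)  # (6 + 6) / 10 = 120 percent
stayers = set(last_year_ids) & set(this_year_ids)      # pupils present in both years
```

Here only 4 of the 10 original pupils contribute scores in both years, so a year-to-year change in an aggregate measure would rest mostly on different pupils.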
Content discontinuity refers to the fact that content upon which
scores and measures are based may not create a continuous
progression across all grade levels; the simplest examples of
this are in the California mathematics standards beginning at
grade 8 and in the English language arts standards beginning at
grade 9. The standards above those grade levels were developed
to recognize the variety of courses and course sequences that
exist across California middle and high schools, so the
standards exist more as a grade level block, rather than a
sequence of grade levels or content. In a school one student
may take a math sequence of algebra, geometry, second-year
algebra and pre-calculus, while another takes pre-algebra,
algebra, statistics and no math class; this content
discontinuity creates an apples and oranges problem that
complicates and possibly invalidates comparisons of aggregate
achievement across the grade levels for that school. This also
creates a problem for comparing individual scores; for example
the student taking geometry and then second-year algebra sees
their test scores go up from one grade to the next, but if that
same student had taken second-year algebra first and then
geometry (as sequenced in some schools), that student's scores
would have gone down from one year to the next. In addition,
since the individual grade level tests in a given content area
cannot, in the time allotted for testing, test all of the
content standards for that grade level and content area, there
is a sampling of content done for inclusion on the tests. So
even if the content standards were completely sequenced across
grade levels, the tests drawn from those standards still may not
reflect a continuous sequence of content. Any discontinuity in
content creates an apples and oranges problem such that growth
in achievement is not reflected in a student's scores across two
years - what would be reflected would simply be that student's
achievement on two different sets of content.
Score incomparability refers to how the underlying scores on the
tests are created. Even if content discontinuity were not at
issue, in order to compare an individual student's test scores
over time the scales on which the test scores are measured at
each grade level would have to be statistically produced
together for all of the grade levels so that there is
a progression of possible scores up the grades (one process for
producing a score scale that has this progression is referred to
as vertical scaling); other statistical mediation approaches
might also be used in order to make those scores comparable. As
an example, take two teachers who both grade their students'
tests on a scale of 0 to 100; can we say that a 90 on one
teacher's test is the same as a 90 on the other? Clearly not,
even if the test content were the same, because we know that
teachers grade differently and that their perceptions of what
gets a score of 90 may be different. However, if we took all of
the tests from both classrooms and examined the results, we
could produce a common scale that reflected the difference
between the two scores of 90 and every other score in the two
classes, and that allowed cross-class comparisons. This same
sort of statistical process would have to be used to allow
scores on a series of grade-level tests to be compared across
those grade levels. The scale scores on the tests in the STAR
program were developed independent of each other and thus do not
validly support this type of cross grade level comparison. Some
would argue that the cut-point or level setting process that is
used to establish the STAR performance levels (e.g., basic,
proficient, advanced) mediates this shortcoming in the scale
scores, but the judgmental nature of such a standard setting
would require extensive statistical validation before it was
determined that this process supports comparisons over time. In
addition, the individual scores produced in the STAR program
form the basis for both the API and for measuring AYP; if the
underlying test scores do not support comparisons over time,
then these resulting aggregate measures will suffer from the
same problem.
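The two-teacher example can be made concrete with a short Python sketch. It uses one simple possibility, linear (mean and standard deviation) equating, which is an assumption made here for illustration and is not a method named in this analysis or used in the STAR program. It further assumes that the two classes are similar in ability, so that differences in their score distributions reflect grading differences. The scores and the function name are invented.

```python
from statistics import mean, pstdev

def to_common_scale(score, own_scores, reference_scores):
    """Express a score from one class on the scale of a reference class.

    The score is converted to its standardized position within its own class,
    then mapped to the same position in the reference class distribution.
    """
    z = (score - mean(own_scores)) / pstdev(own_scores)
    return mean(reference_scores) + z * pstdev(reference_scores)

# Invented scores: teacher B grades about 10 points more leniently than teacher A.
teacher_a = [70, 80, 90]
teacher_b = [80, 90, 100]

# A 90 from teacher B is an average score in that class, so on teacher A's
# scale it corresponds to teacher A's class average.
b_90_on_a_scale = to_common_scale(90, teacher_b, teacher_a)
```

Under these assumptions a 90 from teacher B corresponds to roughly an 80 from teacher A, which is why the two raw scores of 90 cannot be treated as equal.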
How can the test scores and aggregate growth measures be made to
be comparable over time? There are many methodologies across a
broad spectrum of approaches that could be employed to either
eliminate or work around this problem. On one end of that
spectrum might be a full vertical scaling effort. In this
approach test questions from one grade level test would be
administered to students in adjacent grades and the results
would be used to create a common scale across the grade levels.
Thus a student's growth could be tracked as the student moves up
the common scale that runs from the lowest grade level up
through the highest scores at the highest grade level. This
approach is dependent upon the underlying content of the tests
being continuous; in other words movement on the common scale
has to reflect a progression through the content. It is possible
that applying this approach to California might mean a
re-examination of the content standards and test content in
order to ensure that this content continuity exists. Since the
API is an aggregation of STAR test scores, vertical scaling of
the test scores would eliminate most of the problems associated
with using the API to compare school and district performance
across time. At the other end of the spectrum might lie
approaches that rely on statistical procedures to estimate or
project what score, on the average, should be achieved in a
given year based on the previous year's score or other
information. In this way a student's actual score can be
compared to the projected score, and a judgment could be made
about whether the student grew at a greater or lesser rate than
the average. This same sort of statistical mediation could be
used directly on an aggregate measure, such as the API, without
applying the approach to individual test scores.
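A minimal Python sketch of the projection approach follows. It assumes an ordinary least-squares line relating the previous year's score to the current year's score; this particular model, the function names, and all scores are assumptions made for illustration, and the analysis does not specify any one statistical procedure.

```python
from statistics import mean

def fit_line(prior, current):
    """Ordinary least-squares slope and intercept for current-year score on prior-year score."""
    mx, my = mean(prior), mean(current)
    sxy = sum((x - mx) * (y - my) for x, y in zip(prior, current))
    sxx = sum((x - mx) ** 2 for x in prior)
    slope = sxy / sxx
    return slope, my - slope * mx

def growth_versus_projection(prior_score, actual_score, slope, intercept):
    """Actual score minus projected score; positive means above-average growth."""
    projected = intercept + slope * prior_score
    return actual_score - projected

# Invented statewide sample of (prior-year, current-year) scores.
prior = [300, 320, 340, 360]
current = [315, 325, 355, 365]
slope, intercept = fit_line(prior, current)

# A pupil who scored 320 last year and 345 this year, compared with the projection.
difference = growth_versus_projection(320, 345, slope, intercept)
```

With this invented data the projected score for a pupil who scored 320 last year is about 331, so an actual score of 345 would indicate growth at a greater rate than the average.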
There are also many other approaches and methodologies that
could be employed to allow comparisons over time. As with any
large-scale statistical procedure, the trade-off among these
procedures is generally between the increased validity and
accuracy of the resulting measures and the comparisons that are
made using them, and the cost and time involved in implementing
that approach. At the two ends of the spectrum, a vertical
scaling process would be the most involved of the approaches,
while direct statistical mediations would be less costly and
faster. On the other hand statistical mediation does not solve
the underlying problems, but works around them; thus problems
such as content discontinuity would still exist and pose a
potential threat to the validity of the conclusions and
comparisons that we make with these test scores and
accountability measures.
In calling for the development of a new indicator, this bill
does not presume that any of these approaches is best; it does,
however, constrain the contractor and advisory board to develop
an indicator that measures both growth and performance at both
the pupil and aggregate level. In fact, the bill requires that
a single measure be developed to reflect both individual
performance and growth on all tests and aggregate performance
and growth across all tests. This one-size-fits-all approach
may be overly restrictive, since test scores are specific to a
test, a score scale, and a set of content in a grade level
(e.g., grade 6 mathematics); aggregate measures, however, by
their nature combine the information from a number of different
component tests to indicate overall performance. An analogy
making this point might be the evaluation of an automobile.
Tests might generate information measured in miles per gallon,
miles per hour, time elapsed, cost per repair, or number of
injury accidents per thousand miles, but we could not build an
overall indicator of performance for that automobile using any
of those specific measures; instead we would need an overall
aggregate measure that appropriately reflected all of the
component information.
The bill is very specific in stating that this new measure
replaces the current API, and thus rules out any approach that
would simply provide a direct statistical mediation of the API
in order to establish longitudinal comparability of that
aggregate measure of performance.
This bill also requires the engagement of two contractors and
the appointment of a new advisory board. It is unclear what
differentiates the work of the main contractor developing the
new indicator and the advisory board. Since the end product of
this development process is simply a statistical methodology,
and not a physical deliverable such as an extensive software
system or hard product, which would be implemented by the CDE,
either the contractor or the advisory board (with CDE support)
could undertake this task. Also, the addition of an independent
oversight contractor may be more useful in situations of
large-scale development such as software or product development,
rather than in a more research-based activity such as this; for
example, the independent oversight might be more appropriate if
targeted at the CDE implementation of the methodology, rather
than at the development of the methodology itself.
The bill creates potential problems with timing by expressing
the intent of the Legislature that the new indicator be operable
by 2015-16, but also requiring the appropriation of funds and
the approval of the DOF before any of the provisions of this
bill are implemented or either the development or oversight
contract can be let. This requirement could mean that the
Legislature's intent is not acted upon, or that development work
on the new indicator could be delayed until funds are available
or until DOF provides approval. An additional issue of timing
is created by the lack of direction in the bill as to when and
how the new indicator would be implemented or integrated into
use; there is also no direction in the bill as to who would have
the responsibility or authority for implementing this new
indicator. Making a change in how we measure progress of both
students and schools potentially has significant impacts on
individual students, schools and school districts in terms of both
the state and the federal accountability systems, as well as in
overall school reform; a change of this significance should have
the involvement of the Legislature.
Related legislation: This bill is one of four bills that propose
changes to the state's accountability system, specifically to
the API measure, and that will be heard by the Assembly
Education Committee this month. Those four bills are AB 173
(Price), AB 429 (Brownley), AB 1130 (Solorio), and AB 1435 (V.
Manuel Perez). The last page of this analysis provides a
side-by-side comparison of key features of these bills. AB 429
(Brownley), pending in the Assembly Education Committee,
requires examination of methods for making and reporting valid
comparisons of individual academic performance over time and for
making potential improvements in the API, so as to be able to
measure and report both a student's and a school's academic
growth over time. AB 1130 (Solorio), pending in the Assembly
Education Committee, requires examination of methods for making
and reporting comparisons of school and district academic
achievement over time based on a cohort growth measure. AB 1435
(V. M. Perez), pending in the Assembly Education Committee,
requires the examination of assessment data related to the
acquisition of English language by English learners (EL) and of
EL proficiency with respect to making potential improvements in
the API.
Previous legislation: AB 2776 (Mullin), held in the Senate
Appropriations Committee in 2008, would have required
examination of the collection of individual student data, the
state's emerging data systems, the possibility of making real
comparisons of student performance over time, and the long-term
availability of assessment data related to the acquisition of
English language by English learners with respect to making
potential improvements in the API. AB 2478 (Huffman), held in
the Assembly Appropriations Committee in 2008, would have made changes in
the issues on which the advisory committee advising the SPI on
the API is required to make recommendations. AB 519 (Mendoza)
would have required the incorporation of data regarding the
availability in high schools of a course of study that fulfills
University of California and California State University
admission requirements into the API, and the submission of a
plan for incorporating dropout data into the API. This bill was
later amended into different subject matter and author
(Committee on the Budget), and enacted as Chapter 757, Statutes
of 2008. SB 219 (Steinberg), Chapter 731, Statutes of 2007,
makes changes in the calculation of and in the process for
revising the API. AB 400 (Nunez), vetoed in 2007, would have
required the incorporation of additional measures of performance
into the API, including the rate at which pupils are offered a
course of study that fulfills University of California and
California State University admission requirements. AB 2167
(Arambula), Chapter 743, Statutes of 2006, establishes a
specific methodology for including graduation rates, as
previously required, in the API; also requires the SPI to report
annually to the Legislature on graduation and dropout rates in
the state. SB 1284 (Scott), held in the Assembly Appropriations
Committee in 2006, would have updated and made technical
amendments to statutes that establish the API. SB 1448
(Alpert), Chapter 233, Statutes of 2004, reauthorized the STAR
Program. SB 257 (Alpert), Chapter 782, Statutes of 2003,
requires the advisory committee established to advise the SPI on
the API to make recommendations to the SPI on a methodology for
generating a "gain" score measurement to provide a more accurate
measure of a school's growth over time. AB 1295 (Thomson),
Chapter 887, Statutes of 2001, makes changes to the API to allow
small school districts to receive an API score, growth
targets, and performance awards. SB 1 X1 (Alpert), Chapter 3,
Statutes of 1999-2000 First Extraordinary Session, known as the
Public Schools Accountability Act (PSAA), authorizes the state's
current accountability program, including establishment of the
PSAA Advisory Committee and development of the API. SB 2 X1
(O'Connell), Chapter 1, Statutes of 1999-2000, authorized
development of the high school exit examination, and established
a timeline for requiring passage of that examination in order to
qualify for the high school diploma. SB 376 (Alpert), Chapter
828, Statutes of 1997, authorized development and implementation
of the STAR Program.
REGISTERED SUPPORT / OPPOSITION :
Support
None on file
Opposition
None on file
Analysis Prepared by : Gerald Shelton / ED. / (916) 319-2087
Comparisons of Current Law, AB 429, AB 1130, AB 1435, and AB 173 on
Key Elements in the Proposals to Improve California Assessment and
Accountability Measures
-------------------------------------------------------------------------------------------------
|                |Current Law   |AB 173        |AB 429         |AB 1130          |AB 1435       |
|                |              |(4/14/09 ver.)|(introduced)   |(4/22/09 ver.)   |(introduced)  |
|----------------+--------------+--------------+---------------+-----------------+--------------|
|Primary         |Developed API |Replace API   |Facilitate     |Facilitate growth|Add CELDT and |
|proposal        |and advises   |with new      |growth         |comparisons      |EL proficiency|
|                |SPI on        |measure       |comparisons    |                 |to API        |
|                |relevant      |              |               |                 |              |
|                |matters       |              |               |                 |              |
|----------------+--------------+--------------+---------------+-----------------+--------------|
|Improves        |Created       |Both with a   |Both individual|Aggregate        |Aggregate     |
|individual or   |aggregate     |single measure|test scores and|accountability   |accountability|
|aggregate       |accountability|              |aggregate      |measure          |measure       |
|measures?       |measure       |              |accountability |                 |              |
|                |              |              |measure        |                 |              |
|----------------+--------------+--------------+---------------+-----------------+--------------|
|Who makes       |API advisory  |New advisory  |API advisory   |API advisory     |API advisory  |
|recommendations?|committee     |board with    |committee      |committee        |committee     |
|                |              |independent   |               |                 |              |
|                |              |oversight     |               |                 |              |
|                |              |consultant    |               |                 |              |
|----------------+--------------+--------------+---------------+-----------------+--------------|
|Deadline for    |July 1, 2005  |None - not    |July 1, 2011   |None             |July 1, 2010  |
|recommendations?|              |implemented   |               |                 |              |
|                |              |until the     |               |                 |              |
|                |              |Legislature   |               |                 |              |
|                |              |appropriates  |               |                 |              |
|                |              |federal funds |               |                 |              |
|                |              |for this      |               |                 |              |
|                |              |purpose with  |               |                 |              |
|                |              |DOF approval  |               |                 |              |
|----------------+--------------+--------------+---------------+-----------------+--------------|
|Recommendations |SPI           |Not specified |SPI who        |SPI and SBE      |SPI           |
|provided to     |              |              |forwards to    |                 |              |
|whom?           |              |              |SBE,           |                 |              |
|                |              |              |Legislature,   |                 |              |
|                |              |              |Dept of Finance|                 |              |
|----------------+--------------+--------------+---------------+-----------------+--------------|
|How are         |SPI may       |Not specified |Upon           |SPI may implement|SPI may       |
|recommendations |implement with|              |Legislative    |with SBE         |implement with|
|implemented and |SBE approval  |              |action that    |approval, SBE may|SBE approval  |
|when?           |              |              |appropriates   |implement as part|              |
|                |              |              |funds for this |of NCLB plan, or |              |
|                |              |              |purpose        |state may as part|              |
|                |              |              |               |of any other     |              |
|                |              |              |               |federal plan     |              |
|                |              |              |               |submitted        |              |
-------------------------------------------------------------------------------------------------