Statement of James W. Conroy, Ph.D.
In the matter of Rosie D. v. Swift
Regarding the Matter of Fairness in Sampling
1. PURPOSE
I was asked to design a sampling process that would produce an unbiased representation of Massachusetts children who have serious mental health needs and who receive behavioral health services under Medicaid and other state agencies.
The sample was drawn to allow mental health clinicians to review the conditions of these children and to determine whether they needed and would benefit from intensive home and community-based services. The sample was intended to be used in the context of this case, so that generalizations could be made about the class and a much larger group of children with serious mental health needs in Massachusetts.
2. SAMPLING PRINCIPLES
My principal concern in the design of this sample is to be certain that the sampling process is unbiased. In everyday language, this term means fair. Participants in this matter must have confidence that the people in the sample were not chosen in some purposeful fashion in order to support or refute any particular conclusion.
In the present case, given finite resources, and the need to examine large-scale phenomena, it is my opinion that fairness is the primary criterion for a good or valid sample. Hence we used random selection within each group of interest. Sample precision in this case is a secondary concern. When it is not feasible or practical to conduct clinical reviews of large samples, samples drawn fairly, without bias, can be generalized to the populations from which they were drawn not with as much precision, but with reasonable confidence. Through such methodology, one is fairly and fully able to detect large or consistent effects among the sampled children that will reflect true effects within the population of all children in the class.
3. SAMPLING EXPERIENCE
In my ongoing research and evaluation work, I am frequently called upon to design sampling strategies. Nearly all of my work involves programs for people with disabilities. Some of the projects demand data from entire populations, some demand large sample data, and some demand small sample data. Each sampling activity must balance available resources and time against the desire for the highest achievable precision. This tradeoff is ever-present. My work requires many kinds and magnitudes of sampling.
As one example, between 1994 and 1998, I designed successive annual samples of people with developmental disabilities who had moved from institutions to community homes, beginning with N=300, then 600, then 900, and then 1200. This work was intended to assess the outcomes of this transition, and the quality of the new community services. This work has now been made permanent in California statute (Welfare & Institutions Code 4418.1). For another example, in Oklahoma, I am currently implementing a random sample of people with mental retardation who are receiving community services and supports. As a third example, in 1998, I designed small sample focus group activities in 6 states on behalf of the National Center for Outcomes Research (N=3 per state).
These examples are only meant to serve as illustrations of the hundreds of times that I have used sampling theory and generally accepted standards of sampling practice in my disability-related research. My complete CV is attached to this document.
4. OBTAINING THE NECESSARY DATA
I designed the process for drawing the sample for the review, based upon the four subgroups defined by the clinical professionals charged with conducting a clinical review of class members' needs. The clinicians defined four groups of primary interest in the case as follows:
Nature of the group | Shorthand Label
1. Children who received crisis services and who had received such services in the past 2 years | Crisis
2. Children living at home and receiving some form of home-based services from an MCO | Home
3. Children who had been hospitalized and who had been previously hospitalized in the past 2 years | Inpatient
4. Children living in a residential program, usually funded by DSS, DMH, or MBHP | Residential
The clinicians determined that these four subgroups or strata would best capture the nature of the entire population of children with serious mental health conditions.
In May of 2003, the plaintiffs' attorneys requested lists of all children in Massachusetts in those four categories, at a given point in time. The point in time was eventually established by the Court as the most recent date for which this information could be provided, but no earlier than November 2002.
The names of children provided by the MCOs were children who were in each of these four categories during March 2003. The names of children provided by the state agencies (DMH and DSS) were for children in certain categories during November 2002.
Assembling consolidated lists from many sources and software formats proved to be time-consuming. This task was conducted by a database assistant. It is my understanding that there were considerable delays in obtaining these lists that required court action, and that the lists and categories had to be ordered by the Court.
The lists that were produced, particularly by the MCOs, were often far more comprehensive than expected, and included, in some cases, all children who received any mental health service during the designated time period. Therefore, the databases had to be analyzed, sorted, and reorganized.
I instructed the plaintiffs how to do this reorganization, based upon service codes provided by each MCO as well as common sense, and monitored the organization of the process as it unfolded to ensure that it was fair, accurate, and reliable. Through this process the raw data files were organized, and I was able to obtain large lists of children, from all MCOs and state agencies, who fell into each of the four categories. A detailed description of each step in the sorting process, including how the raw databases from the MCOs were organized, to ensure consistency with my instructions and oversight of the process, is attached to my report.
5. SAMPLE DESIGN
Once the four groups of children had been identified, compact disks were sent to me at COA. My team and I converted several kinds of data files so that all the children were placed into a single spreadsheet file in Microsoft Excel. Excel contains a random number generator that enables rapid and easy selection of random samples of each group of children.
Based on past experience with similar databases, I determined to draw several times more children at random from each group than we actually intended to review. I decided to draw three times as many children as were needed for the clinical Review, in order to account for missing addresses, deaths, moves, and other factors.
I understood that the plaintiff would need to obtain consent from the parents, foster parents, or guardians of each child who was randomly selected in order to review his/her records and speak with relevant treating clinicians and staff. Based upon my experience in doing such reviews, I knew that many members of the original sample would not be available for clinical review, either because they had changed addresses, left the service system, could not keep appointments for the review, or otherwise would not consent to participate in the review. Therefore, I recommended that the actual number of children drawn in the first sample be at least three times the number needed. This meant that I drew, at random, at least three times as many children as we were aiming to complete clinical reviews for. I drew a total of 165 children in the four groups, while aiming to complete reviews for about 45.
I used these four lists to draw the samples. James Garrow from my firm actually drew the samples under my direction. He selected these people using a simple random sampling method. Excel assigned a random number between 0 and 1 to each child within each of the four groups, then ordered the numbers and corresponding names from lowest to highest, and selected the first 30 or 45 (depending on the particular group). Mr. Garrow then sent the names of the selected children to plaintiffs' counsel.
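The selection procedure described above (random number per child, sort, take the first 30 or 45) can be sketched in a few lines. This is only an illustration, not the Excel workbook actually used: the function name, the placeholder identifiers, and the seed are all assumptions introduced here.

```python
import random

def draw_simple_random_sample(names, sample_size, seed=None):
    """Attach a random key in [0, 1) to each name, sort by key,
    and keep the first `sample_size` names."""
    rng = random.Random(seed)
    keyed = [(rng.random(), name) for name in names]
    keyed.sort(key=lambda pair: pair[0])  # lowest random number first
    return [name for _, name in keyed[:sample_size]]

# Hypothetical stand-in identifiers; the real lists held children's names.
crisis_list = [f"child_{i:04d}" for i in range(765)]
crisis_sample = draw_simple_random_sample(crisis_list, 30, seed=2003)
print(len(crisis_sample))
```

Because every child receives an independent random key, every child in a group has the same chance of landing among the first 30 or 45 after sorting.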
The overall configuration of populations and samples is summarized in the table below.
Group | Population | Sample Draw | Supplementary Sample | Target for Clinical Reviews
Crisis | 765 | 30 | 30 | 10
Home | 626 | 45* | | 10
Inpatient | 498 | 45* | | 10
Residential | 1337 | 45 | | 15
Total | 3226 | 165 | | 45
* These cells were originally envisioned as 30, but because of a last-minute miscommunication, my associate Mr. Garrow drew more than we thought we needed. This, of course, did no damage whatsoever to the design of the samples. The clinicians still had a list of names, completely random, and they would stop Reviews as soon as they obtained 10 in the Home and Inpatient categories.
The total numbers of children in the four strata are shown in the column headed Population. The next column, Sample Draw, shows how many children in each group I drew, purely at random. Every child in each group had an exactly equal probability of being drawn in the sample. (Because the children in the Crisis group proved extraordinarily difficult to locate, and their consent, records, and other information proved difficult to obtain, I was later asked to randomly select another 30 so that 10 Reviews could be completed.) On the right is Target for Clinical Reviews, showing how many Reviews of randomly selected children we were actually aiming for in each category or stratum.
Based upon my understanding of the available resources, I determined the total number of children who should be reviewed in order to obtain valid and generalizable results. This number was approximately 45, plus the 9 named plaintiffs.
6. DISPOSITION OF THE SAMPLES
It is my understanding that plaintiffs' counsel then attempted to obtain valid addresses and contact each selected child. Not surprisingly, addresses were not available for all children, and even when they were, many did not respond. After all children with addresses were contacted, I was told that only a few children in the crisis category had responded. Therefore, another sample of 30 members of this group or stratum was selected by Mr. Garrow and forwarded to the plaintiffs.
Eventually, we were able to obtain at least 80% of the desired number of children in each group or stratum. I reviewed these numbers and determined that they were sufficient to proceed with the Clinical reviews. In particular, at this point, there were enough so that the randomness and reliability of the conclusions would not be materially impaired.
7. LACK OF BIAS
All four samples are simple random samples, meaning that each child in each group had an exactly equal chance of being selected. My conclusion is therefore that the sampling process is fair, or in survey jargon, unbiased or valid. More specifically, this sampling process was not designed to make any particular point or case, but rather to obtain a fair representation of the average class member's experiences with specialized services. That is the primary criterion for what constitutes a good sample. I therefore conclude that the samples drawn according to the foregoing process are good, valid, and unbiased samples.
Nonetheless, we have a further method available for testing the adequacy and fairness of the samples. For the children in the samples and the populations, we were able to obtain age and gender information. These two simple demographic variables enabled us to ask the question: Are our samples exactly like the populations from which they were drawn?
The table below shows the results for Age.
Group | Population | Sample Draw | Average Age, Population | Average Age, Sample | Significant Difference?*
Crisis | 765 | 60 | 13.74 | 13.53 | No, p=.70
Home | 626 | 45 | 11.66 | 12.45 | No, p=.19
Inpatient | 498 | 45 | 14.36 | 14.39 | No, p=.94
Residential | 1337 | 45 | 12.31 | 10.25 | No, p=.25
Total | 3226 | 165 | 13.09 | 13.32 | No, p=.70
* We employed Student's t-test for differences in means between two independent groups. The p in the equations stands for probability, and the number in the equation would only be considered statistically significant if it were less than .05 or .01, depending on the researcher and the nature of the task.
In all four stratified samples, and for the total, there are no significant differences between population and sample on the Age variable. This shows that the samples are unbiased in this regard.
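For readers unfamiliar with the test named in the table note, the following is a minimal sketch of a pooled-variance t statistic for two independent groups. It is not the analysis actually run for this report: the age lists are invented placeholders, and the p-value uses a normal approximation (an assumption that is reasonable only for fairly large groups) rather than the exact t distribution.

```python
import math
from statistics import NormalDist, mean, variance

def pooled_t_statistic(group_a, group_b):
    """Student's t statistic for two independent groups, pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_err

def approx_two_sided_p(t_stat):
    """Two-sided p-value via the normal approximation (assumption:
    groups large enough that t is close to standard normal)."""
    return 2 * (1 - NormalDist().cdf(abs(t_stat)))

# Hypothetical ages, for illustration only.
population_ages = [9, 11, 12, 13, 13, 14, 15, 16, 17, 12, 14, 13]
sample_ages = [10, 12, 13, 14, 15, 13]
t_stat = pooled_t_statistic(population_ages, sample_ages)
print(t_stat, approx_two_sided_p(t_stat))
```

A p-value above the chosen threshold (.05 or .01) is read, as in the table, as no significant difference between the two groups.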
The results for Gender are analyzed with different statistics, and the results are shown below. The simplest way to present the findings is to show Percent Male or Percent Female. This table answers the question: "Are any of the samples different at all in Gender composition?"
Group | Percent Male in the Population | Percent Male in the Sample | Is the Difference Significant?*
Crisis | 53.8% | 51.1% | No, p=.763
Home | 61.9% | 59.1% | No, p=.416
Inpatient | 54.2% | 75.0% | No, p=.627
Residential | 53.2% | 52.6% | No, p=1.000
Total | 56.3% | 54.9% | No, p=.829
* We tested significance using Chi-Square, with the Continuity Correction for 2x2 tables. In each case, we also checked other commonly employed statistics, including Likelihood Ratio and Fisher's Exact Test, and found no significance anywhere.
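As an illustration of the chi-square test with continuity correction mentioned in the table note, here is a minimal sketch for a single 2x2 table. The counts are hypothetical and are not the counts behind the percentages above. With one degree of freedom, the p-value can be computed from the complementary error function.

```python
import math

def yates_chi_square(a, b, c, d):
    """Chi-square with Yates continuity correction for the 2x2 table
    [[a, b], [c, d]]; returns (statistic, p-value) with 1 df."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    corrected = max(0.0, abs(a * d - b * c) - n / 2)
    chi2 = n * corrected ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: rows = population vs. sample, columns = male vs. female.
chi2, p_value = yates_chi_square(400, 340, 24, 21)
print(chi2, p_value)
```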
These sample validation findings on the samples strongly support the conclusion that the samples are unbiased. In sum, we set out to select four samples that were not in any way tilted toward any preconceived conclusions, and the sample validation data demonstrate that the process worked exactly as intended.
Therefore, it is my opinion that what is true for these samples is very likely to be true for the populations they are intended to represent. Although the samples are small, the conclusions reached on the basis of the samples will be even more valid if the conclusions are strong, prevalent, frequent, or dominant.
8. PRECISION
Precision is another factor used in sampling theory to judge a sample. Surprisingly, the precision of a sample depends primarily on the size of the sample, not on the percentage of the population included, and it is relatively independent of the size of the population. A sample of 400 people out of 10,000 has almost exactly the same precision as a sample of 400 people out of 250,000,000. This seems counterintuitive, but it follows from the mathematics of sampling. [2]
One of the most often heard sampling statistics turns up with political polls and USA Today surveys: "margin of error." We often hear or read that such and such a survey has a "5% margin of error." This is shorthand for a much longer and more complicated statement: "The sample this survey was based on would produce estimates of population characteristics that would be within 5% of the correct value in 90 [3] out of 100 tries."
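The 400-out-of-10,000 versus 400-out-of-250,000,000 comparison can be checked with the standard margin-of-error formula for a proportion, including the finite population correction. The sketch below assumes a 95% confidence level (z = 1.96) and the worst-case proportion of 0.5; both are assumptions made for illustration, as is the function name.

```python
import math

def margin_of_error(sample_size, population_size, z=1.96, proportion=0.5):
    """Margin of error for an estimated proportion from a simple random
    sample, with the finite population correction applied."""
    std_err = math.sqrt(proportion * (1 - proportion) / sample_size)
    fpc = math.sqrt((population_size - sample_size) / (population_size - 1))
    return z * std_err * fpc

small_pop = margin_of_error(400, 10_000)
large_pop = margin_of_error(400, 250_000_000)
print(f"{small_pop:.4f} vs {large_pop:.4f}")
```

Under these assumptions both results come out a little under 5 percentage points, differing by roughly a tenth of a point, even though the populations differ by a factor of 25,000.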
To obtain this 5% margin of error (MOE) level of precision in the current situation would require sample sizes as follows:
Group | Population Size | Sample required for 5% MOE 95 times out of 100
Crisis | 765 | 270
Home | 626 | 250
Inpatient | 498 | 230
Residential | 1337 | 320
Total | 3226 | 1070
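One common textbook way to estimate such required sample sizes is the proportion formula with a finite population correction, sketched below. This is an assumption about method, not the calculation used to produce the table: it yields figures in the same general range but somewhat lower than the rounded values shown, so the table appears to rest on a slightly different formula or rounding.

```python
import math

def required_sample_size(population_size, margin=0.05, z=1.96, proportion=0.5):
    """Sample size needed for a given margin of error on a proportion,
    adjusted for a finite population (a standard textbook formula)."""
    n0 = z ** 2 * proportion * (1 - proportion) / margin ** 2  # infinite-population size
    adjusted = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(adjusted)

for group, size in [("Crisis", 765), ("Home", 626),
                    ("Inpatient", 498), ("Residential", 1337)]:
    print(group, size, required_sample_size(size))
```

Either way, the qualitative point is the same: for populations of a few hundred to about 1,300, a 5% margin of error would require sampling a large fraction of each group.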
It is clear from this table that the usual guidelines for large scale sampling, like political polling, cannot reasonably be applied here as a practical matter. It would be necessary to draw nearly a third of the population into the sample. Resources in most situations would not permit this, and if they did, it would be preferable not to sample at all, but to include the entire population.
Therefore, in my opinion the only sensible and feasible course of action was to draw samples as large as possible within finite resources, but above all other considerations, make sure the samples were and are fair (unbiased). Such samples have less precision than large samples, but they are fair, useful, and informative. Therefore, one can reasonably rely on the conclusions of the clinical experts that are based on reviews of the individuals in the sample. Reliability is an often used term for our confidence in relying upon any given findings from a scientific sample. Moreover, if large or prevalent effects are found, then very high confidence in the findings is justified.
The next table shows the disposition of Plaintiffs' samples, along with estimates of sample margin of error. [4]
Group | Population Size | Sample Size | Margin of Error in 90 Out of 100 Tries | Number of Clinical Reviews Planned | Number of Clinical Reviews Completed | Margin of Error for the Small Sample of Clinical Reviews
Crisis | 765 | 30 | 16.6% | 10 | 7 | 40.6%
Home | 626 | 45 | 13.1% | 10 | 8 | 37.1%
Inpatient | 498 | 45 | 12.9% | 10 | 8 | 37.1%
Residential | 1337 | 45 | 13.3% | 15 | 12 | 28.8%
Total | 3226 | 165 | 6.6% | 45 | 35 | 15.5%
For the purpose of evaluating services in the context of litigation via expert observations, this sample design and numbers are, in my opinion, acceptable. In other words, this sampling technique falls within generally accepted standards of survey sampling methods.
The high margins of error on the right of the table above need to be interpreted with caution. In plain English, they mean that if we drew 100 samples of this size, then in 90 of them our estimates from the sample would be within 15.5% of the true population value, and most would be considerably more accurate than that. Combined with our proof of lack of bias, the data from the clinical reviews can and should be given a great deal of credence.
9. CONCLUSION
For the entire effort of 35 random clinical reviews, plus 8 more for the named plaintiffs, I strongly believe that sufficient reliable information will be obtained in order to draw firm conclusions about the overall experiences of children in the Massachusetts mental health care system.
The most important facet of these samples is that they are unbiased, or in other words, fair. Whatever the clinicians find in their review of mental health service needs is therefore not the result of sampling certain kinds of people to make a point. The samples are fair representations of their respective populations. These samples, being small, cannot be expected to yield exact conclusions; they were designed to produce approximate, but entirely fair, inferences.
In the present case, the samples are well designed and credible for the given purposes. Moreover, because we found that the samples mirrored the populations on two key demographic variables, we know the samples were not biased. Thus they are generally fair, and this allows us to have substantial confidence that clinical findings about the sample are fairly representative of the population or class.
More importantly, if the clinicians find things that are true for most of the children in the samples, then one would be quite justified to conclude that the same is true for most of the children in the class. Whether most means 70% or 90% is relatively unimportant. The samples are designed to, and adequate to, identify dominant and prevalent patterns, rather than exact percentage estimates of population parameters. The present sampling approach is fair, and quite adequate to find large or prevalent patterns of service delivery to the class members.
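The claim that small samples can reliably reveal dominant patterns can be illustrated with a simple binomial calculation: if a condition truly holds for a given share of the population, how likely is it that more than half of a random sample of 35 shows it? The sketch below treats each sampled child as an independent draw, which is an approximation (the actual samples were drawn without replacement and by stratum), and the function name and the rates tried are illustrative assumptions.

```python
from math import comb

def prob_sample_majority(sample_size, true_rate):
    """Probability that more than half of a random sample shows a trait
    that holds for `true_rate` of the population (binomial model)."""
    threshold = sample_size // 2 + 1
    return sum(
        comb(sample_size, k) * true_rate ** k * (1 - true_rate) ** (sample_size - k)
        for k in range(threshold, sample_size + 1)
    )

for rate in (0.7, 0.8, 0.9):
    print(rate, round(prob_sample_majority(35, rate), 4))
```

Under this simplified model, a trait shared by 70% or more of the population would show up as a majority in a sample of 35 nearly every time, which is the sense in which the exact percentage matters less than the presence of a dominant pattern.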
Moreover, I believe the findings from this study design will be very useful in estimating the magnitude of the remedies that may be needed in order to produce better services and better outcomes for children. How many children will need what? Because these samples were drawn at random, and without bias, they should enable us to make rough, but useful, projections about the type and amounts of services needed in the future.
In my opinion, the clinical study samples will yield information that is, both qualitatively and quantitatively, superior to any other information of which I am aware on the issue of children's mental health services in Massachusetts.
[1] In everyday language, precision means the risk of some stated percentage of error. The usual survey jargon makes statements such as "90% of samples drawn in exactly this manner would produce estimates that are within plus or minus 5% of the true population value."
[2] Kish, L. (1965). Survey Sampling. New York: Wiley.
[3] Or 95, or 99, as the survey implementers decide.
[4] Using the 90 times out of 100 criterion.
Conroy Disclosure on Sampling for Rosie D.