******  CHECK_DATA.TXT --- 22 Dec 2003  *******
**

Retrieve the empirical data sets and make sure all data analysis
software is up to date and all empirical stats are produced by
this same software.


=====  Category Rating Experiment  ======

The files C:\Experiments\CatRat1\data\data_by_sbj.mat and
C:\Experiments\CatRat1\data\CatRat1.mat  contain raw data for
65 category-rating subjects.  24 of them, however, belong to
experimental groups 3 and 4 (autocorrelated schedules, practically
uniform) and are not used here. One further subject is discarded,
which leaves 65 - 24 - 1 = 40 "good" subjects (groups 1 and 2)
for the Category-Rating experiment.  Their data are stored in
the file C:\work\anchor\finalsim\CR40data_by_sbj.mat:

 cd 'C:\work\anchor\finalsim\' ; load CR40data_by_sbj
 whos
  S_CR40          1x40        819848  struct array
  rawS_CR40       1x40        735488  struct array

 rawS_CR40       % generated by \CatRat1\data\data_by_sbj.m
1x40 struct array with fields:
    sbj_number
    group
    data
    missing

 S_CR40          % generated by \CatRat1\data\preprocess.m
1x40 struct array with fields:
    sbj_number
    group
    miss_idx
    outl_idx
    RT
    dist
    resp
    resp_distr
    D_regr
    resid
    trend

Note that the missing responses in S_CR40 are filled in (and hence
corrcoef(stim,resp) over these data overestimates the true accuracy).
The outliers are studentized at +/- 3*sigma.
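
The outlier step can be illustrated on synthetic data. This is only a
sketch, NOT the code in preprocess.m: it assumes that outliers are
responses whose regression residual exceeds 3 standard deviations and
that they are replaced by the regression estimate. The variable names
(pred, z, resp_filled) and the plain (non-studentized) scaling by
std(resid) are assumptions made for illustration.

```matlab
% Hypothetical sketch, not preprocess.m: flag residuals beyond 3*sigma
% and replace the flagged responses with the regression estimate.
stim = repmat((1:9)', 50, 1);                 % synthetic 450-trial stimulus sequence
resp = stim;                                  % start from error-free responses
resp(2:2:end) = min(resp(2:2:end) + 1, 9);    % add +1 errors on even trials (capped at 9)
resp(10) = 9;                                 % one gross outlier: stimulus 1, response 9

X     = [ones(size(stim)) stim];              % design matrix: intercept + stimulus
b     = X \ resp;                             % least-squares regression of resp on stim
pred  = X * b;                                % regression-estimated responses
resid = resp - pred;                          % residuals
z     = resid ./ std(resid);                  % residuals in sigma units (assumed scaling)

outl_idx = find(abs(z) > 3);                  % trials beyond +/- 3*sigma
resp_filled = resp;
resp_filled(outl_idx) = pred(outl_idx);       % fill in with the regression estimate
```

The filled-in values are generally not integers, which is why a 'round'
is needed before comparing them with responses on the rating scale.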


 sbj=[S_CR40(:).sbj_number]
sbj = [3  4  9 10 11 12 17 18 20 23 24 28 ...
      29 34 35 36 38 41 42 43 45 49 50 51 ...
      52 53 54 55 56 57 58 59 60 61 62 63 ...
      64 65 66 67 ]

 xtab1([S_CR40(:).group]')
   Value   Count  Percent  Cum_cnt  Cum_pct
-------------------------------------------
       1      20    50.00       20    50.00
       2      20    50.00       40   100.00
-------------------------------------------


The file C:\work\anchor\finalsim\CR40mdata.mat contains the
stimulus-response sequences only, in a format suitable for 
passing to ANCHOR2 and PARAMSEARCH.

 load CR40mdata
CR40mdata 1x40 (295424 bytes) struct array with fields:
    stim: [450x1 double]
    resp: [450x1 double]


Make sure that the two data sets match. The 'round' operator is
needed to conform a few regression-estimated outliers to the scale.

 ok=zeros(1,40);for k=1:40 ok(k)=all(CR40mdata(k).stim==S_CR40(k).dist);end;all(ok)
 ok=zeros(1,40);for k=1:40 ok(k)=all(CR40mdata(k).resp==round(S_CR40(k).resp));end;all(ok)


The function CR_STATS.M calculates a battery of statistics from 
outlier-free S-R sequences. It can be applied both to (filled-in)
empirical data and simulated data. FULL_ANALYSIS.M, on the other
hand, does full empirical analysis with outlier correction. The 
file C:\work\anchor\finalsim\estatsCR40.mat  contains the 
resulting empirical statistics for the 40 CR observers.

 load estatsCR40
  estats1CR40       1x1          11212  struct array
  estatsCR40        1x1          17200  struct array


ESTATS1CR40 are generated by CR_STATS.M. They overestimate the
true accuracy a little because outliers are filled in accurately.

 estats1CR40 = 
    resp_mean: [40x1 double]
     resp_std: [40x1 double]
           R2: [40x1 double]
       blk_R2: [40x5 double]
      blk_std: [40x5 double]
       deltaS: [40x1 double]
      deltaR2: [40x1 double]
          ARL: [40x10 double]
      context: [40x1 double]
     deltaARL: [40x1 double]
     resp_acf: [40x1 double]
    resid_acf: [40x1 double]
    resid_std: [40x1 double]

 s40=[CR40mdata(:).stim];r40=[CR40mdata(:).resp];
 st = CR_stats(s40,r40) ; all(st.resid_acf==estats1CR40.resid_acf)
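
The check above uses exact equality (==) on floating-point values, which
can fail after a software update even when the numbers agree to many
digits. A tolerance-based comparison is a possible alternative. The
sketch below uses made-up numbers in place of st.resid_acf and
estats1CR40.resid_acf, and the tolerance value is an assumption.

```matlab
% Synthetic stand-ins for st.resid_acf and estats1CR40.resid_acf.
a = [0.3435; 0.1780; 0.6749];
b = a + 1e-15;                        % same values up to floating-point noise

exact_match = all(a == b);            % exact comparison, sensitive to rounding noise
tol = 1e-10;                          % hypothetical tolerance
close_match = all(abs(a - b) < tol);  % comparison within the tolerance
```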


ESTATSCR40 are generated by FULL_ANALYSIS.M, plus some manual
rearrangement. Missing data are ignored whenever possible.

 estatsCR40
    sbj_number: [40x1 double]
         group: [40x1 double]
     resp_mean: [40x1 double]
      resp_std: [40x1 double]
            R2: [40x1 double]
        blk_R2: [40x5 double]
       blk_std: [40x5 double]
       deltaR2: [40x1 double]
        deltaS: [40x1 double]
           ARL: [40x10 double]
          dARL: [40x10 double]
             d: [40x1 double]
            dd: [40x1 double]
      deltaARL: [40x1 double]
      resp_acf: [40x1 double]
     resid_acf: [40x1 double]
     resid_std: [40x1 double]
       mean_RT: [40x1 double]
        std_RT: [40x1 double]
       context: [40x1 double]


The function CR_SUMMARY.M aggregates over subjects or model runs.
The first two columns always give the empirical mean and std for
comparison. The remaining columns are the new results. In the
printout below the empirical columns and the new mean/std columns
agree closely because the function is called on the empirical
stats themselves.

 CRsumstats=CR_summary(estatsCR40)
                e.mean  e.std  mean   std     min  median  max
------------------------------------------------------------------
    resp_mean: [5.5580 0.5180 5.5577 0.5181 4.6013 5.5336 6.7472]
     resp_std: [1.7720 0.2440 1.7717 0.2443 1.1049 1.7598 2.3127]
           R2: [0.7750 0.0820 0.7748 0.0825 0.5734 0.7825 0.9052]
       deltaS: [0.5533 0.3520 0.5533 0.3521 -0.0441 0.5300 1.2584]
      deltaR2: [-0.0160 0.1000 -0.0161 0.1002 -0.2518 0.0050 0.1539]
     deltaARL: [0.4870 0.5830 0.4867 0.5830 -0.7772 0.4613 1.6566]
            d: [0.2080 0.4960 <groups messed up>]
           d1: [0.4510 0.3440 <groups messed up>]
           d2: [-0.0350 0.5120 <groups messed up>]
     resp_acf: [0.1950 0.1000 0.1955 0.1002 0.0018 0.1921 0.4035]
    resid_acf: [0.3430 0.1160 0.3435 0.1161 0.1780 0.3385 0.6749]
    resid_std: [0.8090 0.1240 0.8090 0.1244 0.6222 0.7850 1.1355]
    mn_blk_R2: [0.8184 0.7640 0.8224 0.7665 0.8023]
    mn_blkstd: [2.0808 1.5040 1.6476 1.3537 1.5275]
       mn_ARL: [5.2150 5.3207 5.5715 5.4951 5.4114 5.3984 5.3861 5.6141 5.6474 5.5782 ;
                5.1926 5.3693 5.5924 5.6454 5.6183 5.7638 5.8705 6.0121 5.8973 5.9215 ]

The empirical column in Table 3 (\ref{tab:CRExpSum})
is based on these CRSUMSTATS.
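
The new-result columns of the summary (mean, std, min, median, max) can
be reproduced for a single field as sketched below. This is an
illustration on a made-up 4x1 field and a made-up 4x2 blocked field,
NOT the code in CR_SUMMARY.M; the names x, B, row and mn_blk are
hypothetical.

```matlab
% Hypothetical sketch of the aggregation over subjects (not CR_summary.m).
x = [0.70; 0.80; 0.90; 0.60];                     % synthetic per-subject statistic (N x 1)
row = [mean(x) std(x) min(x) median(x) max(x)];   % mean, std, min, median, max

B = [0.8 0.7; 0.9 0.8; 0.7 0.6; 0.8 0.7];         % synthetic blocked statistic (N x nblocks)
mn_blk = mean(B, 1);                              % average over subjects, one value per block
```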


@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

=====  Absolute Identification Experiment  ======

The file C:\Experiments\Miller1\data\S.mat contains raw data for
24 absolute-identification subjects, processed via MILLER_ANALYSIS.M.
The same struct is in C:\work\anchor\finalsim\AI24data_by_sbj.mat.

 cd 'C:\work\anchor\finalsim\' ; load AI24data_by_sbj
 S_AI24          % 471136 bytes
1x24 struct array with fields:
    sbj_number
    group
    miss_idx
    dist
    categ
    resp
    RT
    interv
    xtab

 s24=[S_AI24(:).dist];r24=[S_AI24(:).resp];
 xtab2(r24(:),s24(:))
         |   275   325   375   425   475   525   575   625   675 | Total Percent
---------+-------------------------------------------------------+--------------
miss  -1 |     7    12    20    19    22    19    32    22    11 |   164     1.5
       1 |   804    52     2     1                       1       |   860     8.0
       2 |   350   704   111     8     2                         |  1175    10.9
       3 |    32   365   613   177    19     3     2             |  1211    11.2
       4 |     4    58   371   576   221    31    11     5     2 |  1279    11.8
       5 |     2     9    72   360   618   277    46    12     1 |  1397    12.9
       6 |     1          11    55   288   607   340    68    11 |  1381    12.8
       7 |                       3    27   242   568   357    80 |  1277    11.8
       8 |                       1     2    20   188   584   385 |  1180    10.9
       9 |                             1     1    13   151   710 |   876     8.1
---------+-------------------------------------------------------+--------------
 Total   |  1200  1200  1200  1200  1200  1200  1200  1200  1200 | 10800
 Percent |  11.1  11.1  11.1  11.1  11.1  11.1  11.1  11.1  11.1 |         100.0

Table 2 (\ref{tab:AIErrors}) is based on this information.


The file C:\work\anchor\finalsim\AI24mdata.mat contains the
stimulus-response sequences only, in a format suitable for 
passing to ANCHOR2 and PARAMSEARCH. Missing responses are filled
in with the correct category.
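
The fill-in step can be sketched as follows. The code -1 for a missing
response is taken from the xtab2 printout above; everything else
(made-up data, the name resp_filled, miss_idx as a list of trial
indices) is an assumption for illustration and not the actual code that
produced AI24mdata.mat.

```matlab
% Hypothetical sketch: replace missing responses with the correct category.
categ = [1 2 3 4 5]';                       % synthetic correct categories
resp  = [1 -1 3 5 -1]';                     % synthetic responses, -1 = missing

miss_idx = find(resp == -1);                % trials with a missing response
resp_filled = resp;
resp_filled(miss_idx) = categ(miss_idx);    % fill in with the correct category

acc_filled = mean(resp_filled == categ);             % accuracy on filled-in data
acc_valid  = mean(resp(resp ~= -1) == categ(resp ~= -1));  % accuracy on valid trials only
```

In this toy example acc_filled exceeds acc_valid, which is the sense in
which statistics on filled-in data overestimate the true accuracy.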

 load AI24mdata
AI24mdata 1x24 (265920 bytes) struct array with fields:
    stim
    resp
    fdbk

Make sure the two sets match (except for missing values):
 foo=[AI24mdata(:).stim];all(foo(:)==s24(:))
 foo=[AI24mdata(:).resp];xtab2(foo(:),r24(:))


The function AI_STATS.M calculates a battery of statistics from 
(complete) S-R sequences. It can be applied both to (filled-in)
empirical data and simulated data. MILLER_ANALYSIS.M, on the 
other hand, does full empirical analysis with missing values.
The file C:\work\anchor\finalsim\estatsAI24.mat  contains the 
resulting empirical statistics for the 24 AI observers.

 load estatsAI24
  estats1AI24       1x1          15920  struct array
  estatsAI24        1x1          21396  struct array

ESTATS1AI24 are generated by AI_STATS.M. They overestimate the
true accuracy a little because missing responses are filled in.

 estats1AI24
     resp_mean: [24x1 double]
      resp_std: [24x1 double]
            R2: [24x1 double]
             T: [24x1 double]
         scale: [24x9 double]
         stdev: [24x9 double]
         Pcorr: [24x9 double]
        dprime: [24x8 double]
           bow: [24x1 double]
       blk_acc: [24x5 double]
       blk_std: [24x5 double]
            pr: [24x1 double]
        deltaS: [24x1 double]
           rep: [24x3 double]
           ARL: [24x10 double]
       context: [24x1 double]
      deltaARL: [24x1 double]
    sbj_number: [24x1 double]
         group: [24x1 double]
       missing: [24x1 double]


ESTATSAI24 are generated by MILLER_ANALYSIS.M, plus some manual
rearrangement. Missing data are ignored whenever possible.

 estatsAI24
     resp_mean: [24x1 double]
      resp_std: [24x1 double]
            R2: [24x1 double]
             T: [24x1 double]
         scale: [24x9 double]
         stdev: [24x9 double]
         Pcorr: [24x9 double]
        dprime: [24x8 double]
           bow: [24x1 double]
       blk_acc: [24x5 double]
       blk_std: [24x5 double]
            pr: [24x1 double]
        deltaS: [24x1 double]
           rep: [24x3 double]
           ARL: [24x10 double]
       context: [24x1 double]
      deltaARL: [24x1 double]
    sbj_number: [24x1 double]
         group: [24x1 double]
       missing: [24x1 double]
          hist: [24x10 double]
      exponent: [24x1 double]
      acf_resp: [24x1 double]
     acf_resid: [24x1 double]
       mean_RT: [24x1 double]
        RT_std: [24x1 double]
       RT_prof: [24x9 double]


The function AI_SUMMARY.M aggregates over subjects or model runs.
The first two columns always give the empirical mean and std for
comparison. The remaining columns are the new results. In the
printout below the empirical columns and the new mean/std columns
agree closely because the function is called on the empirical
stats themselves.

 AIsumstats=AI_summary(estatsAI24)
                e.mean  e.std  mean   std     min  median  max
------------------------------------------------------------------
    resp_mean: [5.0300 0.1180 5.0295 0.1179 4.7891 5.0254 5.3220]
     resp_std: [2.4040 0.0870 2.4043 0.0867 2.2614 2.4024 2.5745]
           R2: [0.9010 0.0350 0.9007 0.0345 0.8385 0.8954 0.9564]
            T: [1.6850 0.2070 1.6854 0.2072 1.3648 1.6405 2.1227]
          bow: [0.1420 0.4030 0.1416 0.4033 -0.3500 0.0700 1.0267]
           pr: [0.0650 0.1230 0.0648 0.1226 -0.1222 0.0444 0.3667]
       deltaS: [0.0220 0.1580 0.0218 0.1578 -0.2702 0.0019 0.2934]
     deltaARL: [0.0200 0.2220 0.0204 0.2225 -0.3652 -0.0471 0.5441]
          rep: [0.1150 0.0860 0.1107 0.0836 -0.0691 0.1016 0.2530]
         rrep: [0.6460 0.0900 0.6493 0.0855 0.5263 0.6565 0.8070]
         nrep: [0.5310 0.0840 0.5386 0.0833 0.3925 0.5247 0.7303]
            d: [-0.1450 0.2440 <groups messed up>]
           d1: [-0.2330 0.2150 <groups messed up>]
           d2: [-0.0570 0.2490 <groups messed up>]
     mn_scale: [1.3677 2.3847 3.3687 4.2437 5.0749 5.9470 6.7777 7.6536 8.5025]
     mn_stdev: [0.5186 0.6279 0.7378 0.7721 0.7542 0.7633 0.8007 0.8100 0.6511]
     mn_Pcorr: [0.6742 0.5920 0.5187 0.4883 0.5246 0.5143 0.4854 0.4939 0.5976]
    mn_dprime: [2.2680 1.7870 1.3831 1.2881 1.3087 1.2362 1.3397 1.4457]
    mn_blkacc: [0.4889 0.5208 0.5384 0.5759 0.5537]
    mn_blkstd: [2.4287 2.0701 2.3829 2.0663 2.4070]
       mn_ARL: [5.0231 5.0595 5.1370 5.1122 5.0399 5.0596 5.0258 5.0177 5.0754 5.0409 ;
                4.9906 5.0529 5.0713 5.1018 4.9492 5.0265 5.0060 5.0223 5.0449 5.0466 ]

The empirical column in Table 1 (\ref{tab:AIExpSum})
is based on these AISUMSTATS.
