There has been controversy about this conclusion, with some people claiming that the patient's virus wasn't any closer than the local control viruses.
I decided to look at the DNA sequences myself (you can get them from ncbi.nlm.nih.gov). The data include sequences from the dentist, from patients A through H, and from 35 local residents. I compared a stretch of 40 envelope amino acids across the different strains.
Here is how many of the 40 amino acids matched in each comparison:

Different strains from the dentist, compared with each other: 36 to 40
The dentist and A: 37 to 39
The dentist and B: 37 to 39
The dentist and C: 38 to 40
The dentist and D: 30 to 31
The dentist and E: 37 to 38
The dentist and F: 31 to 33
The dentist and G: 36 to 38
The dentist and H: 28 to 30
The dentist and locals: an average of 32, with the best match being 34 to 36
It should be clear from these numbers that the dentist's strains are very close to those of A, B, C, E, and G; they are about as similar to the dentist's strains as the dentist's own strains are to each other. The dentist's strains are not so close to those of D, F, H, or the locals.
This is, of course, not at all rigorous. I just wanted to see if the original study seemed believable to me when I looked at the data myself. It makes me inclined to believe the claims that 5 of the patients were infected by the dentist.
Now, the idea is that as things mutate, you end up with changes in the DNA sequence and changes in the amino acid sequences. For example, you can look at the amino acid sequence in hemoglobin in a bunch of different species, look at how closely related they are, and determine how the species evolved from each other.
The HIV study looked at a part of the virus's envelope that mutates very rapidly. The envelope is what antibodies attack, so by changing the envelope, the virus can evade antibodies, which is why a vaccine is so hard to make. So the researchers take the virus, sequence the DNA that specifies the amino acids in that part of the envelope, and then determine the amino acid sequence from the DNA sequence.
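To make the last step concrete, here is a minimal sketch of turning a DNA sequence into an amino acid sequence with the standard genetic code. The DNA fragment in it is made up purely for illustration (it is not taken from the study's data); I picked it so that it translates to the first six amino acids of the envelope stretch discussed on this page. The names `CODON_TABLE`, `translate`, and `EXAMPLE_DNA` are my own.

```python
# Standard genetic code, written as the usual 64-letter string with the
# bases ordered T, C, A, G at each codon position ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Build a lookup from each three-base codon to its one-letter amino acid.
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string, read in frame from its first base."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical fragment, invented for this example only.
EXAMPLE_DNA = "GAAGTAGTAATTAGATCT"
print(translate(EXAMPLE_DNA))  # EVVIRS
```

Several codons map to the same amino acid, so two samples can differ in their DNA and still have identical amino acid sequences.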
Now, I took these sequences and looked at a subsequence of 40 amino acids from each one.
Dentist, sample 1:   EVVIRSANFTDNAKIIIVQLNASVEINCTRPNNNTRKGIH
Patient A, sample 1: EVVIRSANFTDNAKIIIVQLNASVEINCTRPNNNTRKGIR
Local #13:           EVVIRSENFSDNAKTIIVQLKESVEINCTRPNNNTRKRIT

Then it's just a matter of counting up the matches to see how closely related they are (39 matches out of 40 in this case between the dentist and A). This comparison can be done more rigorously, of course, which the original paper did. I just wanted to take a quick look. The paper looked at the DNA sequences as well as the amino acid sequences.
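The counting itself can be sketched in a few lines of Python. This is just my quick position-by-position comparison written out, not the method of the original paper; the function name `count_matches` is my own, and the three sequences are the ones shown above.

```python
def count_matches(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b))


DENTIST_1 = "EVVIRSANFTDNAKIIIVQLNASVEINCTRPNNNTRKGIH"
PATIENT_A_1 = "EVVIRSANFTDNAKIIIVQLNASVEINCTRPNNNTRKGIR"
LOCAL_13 = "EVVIRSENFSDNAKTIIVQLKESVEINCTRPNNNTRKRIT"

print(count_matches(DENTIST_1, PATIENT_A_1))  # 39
print(count_matches(DENTIST_1, LOCAL_13))     # 33
```

This only works because the sequences are already lined up and have the same length; sequences with insertions or deletions would need a proper alignment first.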
So the assumption is that the less related the two samples are, the more differences there will be between them. This isn't strictly true, since two unrelated samples could happen to mutate to get closer by chance.
HIV keeps mutating in the body, avoiding antibodies, acquiring AZT resistance, and things like that. So even different samples taken from the same person will have slightly different sequences. There were 6 samples from the dentist, so depending on which one you looked at, there would be a different number of matches. The dentist's own samples had 36 to 40 matches when compared with each other. This also explains why the match numbers are ranges; the value depends on which dentist sample is used.
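As a rough check on the 36 to 40 figure, the sketch below compares every pair of the dentist's six samples and reports the lowest and highest match counts. The sequences are the six dentist entries from the listing on this page; the helper names `count_matches` and `match_range` are my own.

```python
from itertools import combinations

# The six dentist samples, in the order they appear in the listing.
DENTIST = [
    "EVVIRSANFTDNAKIIIVQLNASVEINCTRPNNNTRKGIH",  # M90848
    "EIVIRSANFTDNAKIIIVQLNASVEIDCTRPNNNTRKGIH",  # M90849
    "EVVIRSANFTDNAKIIIVQLNASVEINCTRPNNNTRKGIH",  # M90850
    "EVVIRSANFTDNAKIIIVQLNASVEINCTRPNNYTRKGIR",  # M90851
    "EVVIRSANFTDNAKIIIVQLNASVEINCTRPNNNTRKGIH",  # M90852
    "EVVIRSANFTDNAKIIIVQLNASVEINCTRPNNYTRKGIR",  # M90853
]


def count_matches(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences agree."""
    return sum(x == y for x, y in zip(a, b))


def match_range(samples: list[str]) -> tuple[int, int]:
    """Lowest and highest match count over all pairs of samples."""
    scores = [count_matches(a, b) for a, b in combinations(samples, 2)]
    return min(scores), max(scores)


print(match_range(DENTIST))  # (36, 40)
```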
M90848-dentist EVVIRSANFTDNAKIIIVQLNASVEINCTRPNNNTRKGIH
M90849-dentist EIVIRSANFTDNAKIIIVQLNASVEIDCTRPNNNTRKGIH
M90850-dentist EVVIRSANFTDNAKIIIVQLNASVEINCTRPNNNTRKGIH
M90851-dentist EVVIRSANFTDNAKIIIVQLNASVEINCTRPNNYTRKGIR
M90852-dentist EVVIRSANFTDNAKIIIVQLNASVEINCTRPNNNTRKGIH
M90853-dentist EVVIRSANFTDNAKIIIVQLNASVEINCTRPNNYTRKGIR
M90855-patientA EVVIRSANFTDNAKIIIVQLNASVEINCTRPNNNTRKGIR
M90862-patientB EIVIRSANFTDNAKIIIVQLNASVEINCTRPNNNTRKGIH
M90877-patientC EVVIRSANFTDNAKIIIVQLNASVEINCTRPNNNTRKGIH
M90882-patientD EVVIRSANFSDNAKTIIVQLNKSVNITCVRPNNNTRESIP
M90893-patientE EIVIRSANFTDNAKIIIVQLNASVEINCTRPNNNTRKGIN
M90895-patientF EVVIRSENFMDNVKTIIVQLNESVQINCTRPNNNTRKSIH
M90902-patientG EVVIRSANFTDNAKIIIVQLNAPVEINCTRPNNNTRGGIH
M90907-patientH EVIIRSENFTDNAKTIIVQLNATINIICERPHNNTRKSIH
M90916-control EVVIRSENFTNNAKIIIVHLNKTVNITCTRPNNNTRRSIP
M90917-control EIVIRSANFTDNTKIIIVQLNESVEINCTRPNNYTGKRLS
M90918-control EIVIRSANFTDNTKIIIVQLNESVKINCTRPSNNTRKSIP
M90919-control EIVIRSANFTDNTKIIIVQLNESVEINCTRPNNYTGKRLS
M90920-control EIVIRSANFTDNTKIIIVQLNESVEINCTRPSNYTGKRLS
M90921-control EIVIRSANFTDNTKIIIVQLNESVEINCTRPSNNTRKSIP
M90922-control EIVIRSANFTDNTKIIIVQLNESVEINCTRPSNNTRKSIP
M90924-control EVVIRSENFTDNTKTIIVQLNTSVTINCTRPGNNTRKSIT
M90925-control EVIIRSENFTDNTKTIIVQLNTSVTINCTRPGNNTRKSIT
M90926-control EVVIRSENFTDNTKTIIIQLNTSVTINCTRPGNNTRKSIT
M90927-control EVVIRSENFTNNAKTIIVQLNTSVTINCTRPGNNTRKSIT
M90928-control EIVIRSANFTDNTKIIIVQLNESVEINCTRPSNNTSKSIH
M90929-control EVVIRSENFTNNAKTIIVQLNTSVTINCTRPGNNTRKSIT
M90932-control EVVIRSENFTNNAKTIIVQLKESVKINCTRPNNNTRKSIN
M90933-control EVVIRSENFTDNAKTIIVQLNNSVVINCTRPNNNTRRSVH
M90934-control EVVIRSENFTNNAKTIIVQLKESVKINCIRPNNNTRRSIN
M90936-control EVVIRSENFTNNSKTIIVQLKESVVINCTRPNNNTRRSIH
M90938-control EVVIRSENFSDNAKTIIVQLKESVEINCTRPNNNTRKRIT
M90939-control EVVIRSENFTNNAKTIIVQLNVSVEINCTRPNNNTRKGIH
M90940-control EVVIRSENFTDNAKTIIVQLKEPVEINCTRPSNNTRKGIP
M90943-control EVVIRSDNFTDNVKTIIVQLNEAVVINCTRPNNNTRRGIH
M90944-control EVVIRSENFTDNAKTIIVQLNESIEINCTRPNNNTRKSIP
M90945-control EVIIRSENLTDNAKTIIVQLKEPVIINCTRPNNNTRKSIH
M90950-control EVVIRSENISDNAKTIIVQLNESVVINCTRPNNNTRRSIH
M90951-control EVVIRSDNFSDNARTIIVQLNESVVINCTRPNNNTSRRIS
M90952-control EVVIRSENFTDNAKTIIVQLNQSVEINCTRPNNNTRRSIH
M90957-control EIVIRSENFTNNARTIIVHLNESIVINCTRPNNNTGKSIH
M90959-control EIVIRSDNFTDNAKTIIVQLNQTVEINCTRPNNNTRKSIH
M90962-control EVVIRSKNFTDNAKTIIVQLNESVAINCSRPNNNTRKGIH
M90963-control EIVIRSENFTDNLKNIIVQLKEPVEINCTRPGNNTRRSIH

You can simply count the matches to see where I got my numbers; this isn't rocket science. As mentioned above, I only used a short part of the sequence. You can get all the original sequences from ncbi.nlm.nih.gov if you want to do comparisons on your own.
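If you would rather not count by hand, here is a sketch that compares each patient's listed sequence against all six dentist samples and prints the range of matches. It assumes only the sequences shown in the listing (one per patient); the names `count_matches`, `match_range`, `DENTIST`, and `PATIENTS` are my own. The controls can be checked the same way by adding them to a dictionary like `PATIENTS`.

```python
def count_matches(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences agree."""
    return sum(x == y for x, y in zip(a, b))


def match_range(sequence: str, references: list[str]) -> tuple[int, int]:
    """Lowest and highest match count of one sequence against a set."""
    scores = [count_matches(sequence, ref) for ref in references]
    return min(scores), max(scores)


# The six dentist samples (M90848 through M90853 in the listing).
DENTIST = [
    "EVVIRSANFTDNAKIIIVQLNASVEINCTRPNNNTRKGIH",
    "EIVIRSANFTDNAKIIIVQLNASVEIDCTRPNNNTRKGIH",
    "EVVIRSANFTDNAKIIIVQLNASVEINCTRPNNNTRKGIH",
    "EVVIRSANFTDNAKIIIVQLNASVEINCTRPNNYTRKGIR",
    "EVVIRSANFTDNAKIIIVQLNASVEINCTRPNNNTRKGIH",
    "EVVIRSANFTDNAKIIIVQLNASVEINCTRPNNYTRKGIR",
]

# The one listed sequence for each patient.
PATIENTS = {
    "A": "EVVIRSANFTDNAKIIIVQLNASVEINCTRPNNNTRKGIR",
    "B": "EIVIRSANFTDNAKIIIVQLNASVEINCTRPNNNTRKGIH",
    "C": "EVVIRSANFTDNAKIIIVQLNASVEINCTRPNNNTRKGIH",
    "D": "EVVIRSANFSDNAKTIIVQLNKSVNITCVRPNNNTRESIP",
    "E": "EIVIRSANFTDNAKIIIVQLNASVEINCTRPNNNTRKGIN",
    "F": "EVVIRSENFMDNVKTIIVQLNESVQINCTRPNNNTRKSIH",
    "G": "EVVIRSANFTDNAKIIIVQLNAPVEINCTRPNNNTRGGIH",
    "H": "EVIIRSENFTDNAKTIIVQLNATINIICERPHNNTRKSIH",
}

for name, seq in PATIENTS.items():
    low, high = match_range(seq, DENTIST)
    print(f"The dentist and {name}: {low} to {high} matching")
```

For example, the first line printed is "The dentist and A: 37 to 39 matching".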