Friday, February 5, 2016

#ParsimonyGate: The Perspective of a Reformed ‘Hardcore’ Cladist


If you are reading this article you have probably read the now infamous editorial in the journal Cladistics http://onlinelibrary.wiley.com/doi/10.1111/cla.12148/full . Although signed “The Editors,” it isn’t clear if this was approved by anyone besides the current Head Editor, who is most certainly a “hardcore cladist” (someone who thinks parsimony is the most reasonable, if not only, tool for inferring the historical relationships of organisms through phylogenetics). I distinguish between “hardcore cladists” and just “cladists,” because I think I am a cladist and that most other systematic biologists are too. A cladist in my personal definition is anyone who distinguishes between plesiomorphic (“primitive” features shared with a designated outgroup) versus apomorphic (derived characters distinct from the outgroup condition). Under my broader definition, basically everyone doing morphological or molecular work to discover the relationships of organisms is a cladist, and it doesn't matter if you are using parsimony, likelihood, or Bayesian approaches. The only exceptions are folks who use overall similarity (e.g., bats and birds are close relatives because they both have warm blood and wings), which doesn’t distinguish analogy (convergence of characters) from homology (characters derived from common ancestry) because it doesn’t follow the Hennigian, or cladistic, principle of distinguishing between plesiomorphic versus apomorphic characters.
            The view of cladistics I outline above is basically the foundation of Willi Hennig’s 1966 book “Phylogenetic Systematics,” which is the bible of the Willi Hennig Society (publisher of Cladistics) and the foundation of modern systematic theory. At the time the idea of distinguishing between plesiomorphy versus apomorphy was radical. Famed evolutionary biologist Ernst Mayr was the first to call those following Hennig’s principles “cladists” - as a pejorative by the way. Mayr preferred doing “evolutionary taxonomy” - basically where the expert on a group makes a hypothesis about the relationships of organisms based on characters they think are most important for supporting those relationships (e.g., owls, eagles and hawks are all each other’s closest relatives because these “raptors” all kill with their feet). The other alternative in systematics in those early days was numerical phenetics, whose practitioners used overall similarity to group organisms as I explain above. (Read more about this interesting time in history in David Hull’s “Science as a Process.”) The original cladists weren’t fighting for parsimony; they were fighting to use only derived characters in phylogenetics. Parsimony came around a little later with the work of several groups, mainly from the University of Michigan and the American Museum of Natural History. Parsimony was the only game in town to the early cladists, who used it mainly for understanding the transition of morphological characters from primitive to derived. Then with the rise of molecular tools for obtaining DNA characters came new methods for inferring trees: model-based approaches including maximum likelihood and eventually Bayesian inference. Systematists of all sorts would meet at the annual Systematic Zoology/Biology meetings until the “hardcore cladists” decided to break away and have their own meeting, the meeting of the Willi Hennig Society, founded in 1980.
            Now I should mention I trained as a systematist at both the University of Michigan and the American Museum of Natural History, the hotbeds of cladistics and Hennig worship, albeit late in the game in the early 2000s. I was a hardcore cladist most of my early graduate career. I thought parsimony was the only reasonable way to infer relationships because it wasn’t a model-based approach like maximum likelihood or Bayesian inference. Those models made too many assumptions, I thought and was taught. Parsimony, by contrast, wasn’t a model because the foundation of that idea is to “minimize ad hoc assumptions about homoplasy” (i.e., reduce noise in the tree from characters moving around). Using parsimony, the shortest tree with the fewest steps (or evolutionary transitions) is the best tree – period. The other methods were using models to guesstimate from DNA sequences too much about how often an A (adenine) turns to a C (cytosine) or a T (thymine) to a G (guanine). It was crazy how much assuming those crazy-assuming people were doing. If they just did some morphology they would better understand how all this stuff really worked and that there is only one true religion, I mean method: parsimony. We were the Jedi knights that stuck to our principles; those other folks just weren’t thinking it through. Then something happened: I saw the light.
            I realized at one point that there isn’t a right way to study historical relationships. We can’t actually know the truth about who is related to whom when discussing organisms that diverged millions of years ago. We are also using methods that are extremely computationally intensive. They are all models, even the heuristic we use to run parsimony. No computer on Earth can exhaustively evaluate every possible tree for more than a dozen or so species under parsimony or likelihood, which is why we rely on heuristics: there are just too many possible answers. When we study a historical science using morphological characters or DNA we will never be sure we are right. As I started using DNA methods more I realized I wanted to better understand when these lineages started to diverge. I needed to put a rough age on a group and to do that I needed to use likelihood and Bayes because only those use evolutionary models from which you can understand how DNA sequences change over time. I slowly found myself using these other methods more and more. Did I still use parsimony? Sure, sometimes, but it gave me the same answer as those other methods, just with less information (e.g., a tree without branch lengths or information about time). The relationships themselves are interesting but I also wanted to know about evolution and biogeography beyond the tree.
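To put a rough number on “too many possible answers,” here is a minimal Python sketch. It assumes unrooted, fully bifurcating trees with labeled tips, for which the standard count is (2n - 5)!! for n species; the function name is made up for illustration.

```python
def count_unrooted_trees(n_taxa: int) -> int:
    """Number of distinct unrooted, fully bifurcating trees for n_taxa labeled tips.

    Uses the standard result (2n - 5)!! = 1 * 3 * 5 * ... * (2n - 5), valid for n >= 3.
    """
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5 (an empty product for n = 3).
    for odd in range(3, 2 * n_taxa - 4, 2):
        count *= odd
    return count


if __name__ == "__main__":
    for n in (4, 5, 10, 12, 20, 50):
        print(f"{n:>2} species: {count_unrooted_trees(n):.3e} possible trees")
```

Four species give 3 possible trees and ten species already give 2,027,025; the count keeps multiplying by a larger odd number with every species added, which is why exhaustive searches stop being practical so quickly.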
            Now I’m still a cladist, and I hope I can count many friends among those in the Willi Hennig Society. (I named my dog Willi.) The folks in that society helped me think more clearly about methods and the philosophy of systematics, and also about the limits of what we can know in general (epistemology). They have invited me for talks at their annual Hennig meetings and I always learn a lot at this conference. Many people are intimidated by these meetings because many senior members do yell at each other, but they are friends in the end - trying to improve each other’s work. They do sometimes pick on folks that aren’t their friends, and that isn’t cool. You do have to bring your “A” game to Hennig because there are no concurrent sessions and there is unlimited time for questions. You always have to explain why you picked a certain method over another; it isn’t about using the newest method, it is about justifying your choice. (Much like the Cladistics editorial was trying to say. I think.) Compared to other meetings where there are often few, if any, questions - even after a terrible talk - I actually think Hennig is doing it right. They have many fewer members than other major systematic societies so they have the luxury of having just one session at a time and an open-ended question period. The Hennig conference is also strongly skewed male, which is a problem they really need to fix. Many senior members of the society need to tone it down a bit too. They can be crass and pedantic and use jargon as a weapon to make semantic arguments over relatively mundane things (“how can you test a model with a model”; “is there such a thing as an order-quantifiable metric of similarity”). I still publish in Cladistics (as recently as last year) and that paper even had a Bayesian analysis in it. Although I let my membership lapse a few years ago I’m not opposed to going to another meeting in the future. 
I think the editorial they published is a step backwards only because it sounds so uninviting: “If alternative methods give different results and the author prefers an unparsimonious topology, he or she is welcome to present that result, but should be prepared to defend it on philosophical grounds.” Many read that as, “You can submit non-parsimony things but you need to explain why, and even if you explain why, we still might not like it because parsimony.”
            I think the editorial was a mistake because it sounded like they will only accept the parsimony answer if you get alternatives from other sources. And that makes the journal “Hardcore Cladistics” when it was, at least recently, just “Cladistics.” I do hope they reconsider their stance, or at least clarify. I still consider Cladistics a great journal, one that I enjoy reading because of its organismal focus on systematics. I haven’t had issues with editors or reviewers telling me I need to do a parsimony analysis or remove a likelihood or Bayesian analysis, but I’ve heard that others may have. Time will tell if the journal and the society can right the ship; unfortunately, it was a storm of their own creation that has it teetering.

Tuesday, January 26, 2016

Learning “R” in Spain


Studying turtles with R. Julien Claude in the background.
The sun rising from Montserrat.
From January 17-23 my new PhD student A.J. Turner and I went to a small town near Barcelona in Cataluña, Spain. We were there to take a morphometrics course in R (more info here: Transmitting Science). For the uninitiated, R is a programming language and environment that can be used to manipulate data, conduct analyses, and make beautiful figures - among other things. We would like to use R to measure and compare shapes of various fish species to better understand how body shapes change over the life of an organism; how these shapes evolve among/between groups; and how to use information about shape to better understand the changing forms of fishes over time. A.J. is quite clever and smart and he will one day be able to use this tool to make his cutting-edge dissertation even more cutting-edge. I was once clever and smart too but I’ve felt a little dumb post-tenure. I saw this class as an opportunity to retool, and to reshape (pun intended) some old projects and to think of new ones. It didn’t hurt that the course took place in beautiful Spain. The course was taught by Julien Claude, who is the author of the book “Morphometrics with R,” and he also co-authored an R package called “ape” that has been cited thousands of times. There were about twenty other students from around the world there; some were studying shapes of dinosaur bones, or fruits, or flowers - among countless other projects. Almost all brought data to play with and manipulate. From 9am to 7pm for five days we were on our computers going through dozens of examples and exercises. It was rather intense, especially for me – not having been a student since I got my PhD almost 10 years ago. Except for some short breaks and meals, we were engrossed in R all day. The group of students and instructors was a disparate mix of international students, postdocs and PIs. Luckily everyone was very nice and A.J. and I ended up with twenty or so new friends and maybe some future co-authors. I particularly liked Julien. 
He and I share a rather silly and nerdy sense of humor. A running joke about one of the students being from the future and taking this class to destroy R like the Terminator had us giggling for days for some strange reason. (It might be that writing ten hours of computer code a day makes almost everything else hilarious.) Speaking of bad jokes: Do you know the favorite coding language of pirates? … R! By Day 4 my brain was full and I needed to take a bit of a break from the dark classroom and spend some time outdoors. A.J. and I got up at 5am and took a cab to the top of beautiful Montserrat and watched the sun rise over Cataluña. A.J. and I found ourselves walking around the grounds of a rather breathtaking Basilica at the top of Montserrat. There were monks chanting, bells ringing, and beautiful rows of multicolored candles lit for prayer. The sun rising over the mountains was stunningly beautiful, as were the paintings and décor inside the monastery. The monastery has a famous dark-skinned Virgin Mary statue that reminded us of this part of Spain’s rich African history. Cataluña houses an interesting mix of cultures, something that is notably distinct from the rest of Spain. We were often greeted with “Bon dia” in the morning (similar to the Portuguese “Bom dia”), and with “merci” in place of “gracias.” But alas we only had time to learn one new language, and we were back learning the grammar and culture of R in our classroom that same morning. In R we say hello like this: setwd("/Users/Prosanta/Desktop/MorphometricswithR2016/datasets"). Our visit to Montserrat was just a few hours but luckily we also had a few free hours when we landed in Barcelona. My postdoc Fernando Alda is from Spain and I wouldn’t have been able to look him in the eyes if I didn’t tell him we saw at least some sights while we were in his home country. Luckily we were able to also see the Sagrada Familia on our way to the course on the day we landed. 
The Sagrada Familia is the infamously beautiful/hideous giant church designed by Spain’s most influential architect Antoni Gaudi. Inspired by biological shapes (apparently all biological shapes all at once), Gaudi initiated construction of this building in 1882 and it won’t be finished until 2020 (maybe). The building is impossible to describe with words, but let’s just say I don’t think Gaudi would have been good at R. Although I think our instructor Julien Claude can probably make anyone good at R.
               Julien is a patient and kind instructor who made sure every student was getting the current set of skills being taught before moving on; and he also understood that we each had different goals, projects, and kinds of data. For me, learning elegant new tests of hypotheses for modularity (the independent changing of shape in one body part versus another) or fluctuating asymmetry (the unbalanced growth across a body’s axis of symmetry) was worth the price of admission. I already have new projects in mind and hope to help some students learn new morphometric techniques. A.J. and I are extremely grateful to our Department of Biological Sciences and Office of Research and Economic Development for the opportunity to attend the R class in morphometrics. 





A.J. Turner at La Sagrada Familia





Wednesday, November 4, 2015

Survey Your Society to Gather Demographic Data

If you have been to a scientific conference and looked around a bit you see students, postdocs, faculty and other professionals. We care mostly about the scientific abilities of these folks - how well they present their findings, the significance of their work, the ambition of the young, and the impact of the senior members. But we should also care about the demographic make-up of the members of these academic societies. (If I have to explain why diversity is good, stop reading here.) Each society should know: What is the ratio of the sexes: 50/50? How many folks are internationals/locals? Are there members with disabilities? What about the make-up of different races and ethnicities? Does your academic society look like the general population? Does it even look like your academic institution?


I would like each member of a society to ask their governing body to send out a simple demographic survey to all its members to gather these data anonymously. The survey below is crude and oversimplified and based on the one from the National Science Foundation, but it is better than nothing. Keeping this anonymous ensures that no members should feel uncomfortable revealing this information. The results of the survey should be presented as simple pie charts on the group's public website.
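As a rough illustration of how the anonymous answers could be turned into the percentages behind those pie charts, here is a minimal Python sketch. The response records are made up, the function name is hypothetical, and the layout (one dictionary per respondent, each question mapped to a list of chosen options) is just an assumption about how the survey tool might export its data.

```python
from collections import Counter


def tally(responses, question):
    """Return {option: percent of respondents who chose it} for one question.

    Each response maps a question to a list of chosen options, so
    "choose one or more" questions work; percentages can then sum to over 100.
    """
    # Keep only respondents who actually answered this question.
    answered = [r[question] for r in responses if r.get(question)]
    # Count each option at most once per respondent.
    counts = Counter(option for choices in answered for option in set(choices))
    return {option: round(100 * n / len(answered), 1) for option, n in counts.items()}


# Made-up example data; no names or emails are stored with the answers.
responses = [
    {"GENDER IDENTITY": ["Female"], "RACE": ["Asian"]},
    {"GENDER IDENTITY": ["Male"], "RACE": ["White"]},
    {"GENDER IDENTITY": ["Female"], "RACE": ["Asian", "White"]},
    {"GENDER IDENTITY": ["Do not wish to provide"], "RACE": ["Black or African American"]},
]

if __name__ == "__main__":
    for question in ("GENDER IDENTITY", "RACE"):
        print(question, tally(responses, question))
```

Running the same tally on each year's responses would give the baseline and the year-to-year comparison with no extra work.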
 

So why do this survey? For starters you can learn how well your society is doing at recruiting and retaining a diverse membership. You won't know without a baseline survey. Doing the survey annually will tell you if you have a problem with retention and recruiting. It can help you improve your group's diversity. Does your society have few female members - have you thought about having more female members as part of the governing body, balancing the gender ratio of invited speakers, and perhaps having some parental care options for young parents attending your conferences? Does your society have few African Americans - have you thought of sending some members to recruit and visit at HBCUs?

I hope that every scientific society starts keeping track of this kind of information. We can compare across groups that way. Those comparisons will help us know if there is a general problem across academic societies, or if it is just an issue in some sub-disciplines.

As the Chair of the Diversity Committee in my college I know we try to get a diverse pool of candidates to apply to open positions in my university. Part of the way we do that is by contacting groups with a diverse membership. If your scientific society lacks a diverse membership, you won't be helping with our goal. A simple survey like the one below can help you identify potential issues and help you begin the process of trying to solve them.

Please try to convince your academic societies that this survey of membership demographics is important.


EXAMPLE SURVEY 
Send out an email to members and then send out an anonymous SurveyMonkey questionnaire.

Dear XXXX Members,
We want to collect diversity data from our membership.  You will get an invitation to participate in a SurveyMonkey questionnaire from zzz@yyy.com
Please complete the survey by xx/xx/xx.

As a society we want to be aware of our ability to recruit a diverse group of scholars from different backgrounds. This survey will help us better understand how well we reflect the general population and how we compare to other scientific and academic societies and organizations. Please help by taking a few minutes to fill out this survey.


ETHNICITY (choose one)

_____Hispanic or Latino

_____Not Hispanic or Latino

_____Do not wish to Provide



RACE (choose one or more)

_____American Indian or Alaskan Native

_____Asian

_____Black or African American

_____Native Hawaiian or Other Pacific Islanders

_____White
_____Other (describe)

_____Do not wish to Provide



DISABILITY (choose one or more)

_____Hearing Impairment

_____Visual Impairment

_____Mobility/Orthopedic Impairment

_____Other (describe)

_____None

_____Do not wish to Provide



GENDER IDENTITY (choose one or more)*
_____Female
_____Male
_____Transgender
_____Other (describe)
_____Do not wish to provide

And then adding

SEXUAL ORIENTATION (choose one or more)*
_____Straight
_____Lesbian
_____Gay
_____Bisexual
_____Queer
_____Other (describe)
_____Do not wish to provide



*UPDATED - thanks to Jeremy Yoder (@JBYoder) and Allison Mattheis for providing these categories from their Queer in STEM survey

Monday, March 2, 2015

So you want a recommendation letter…


‘Tis the season for writing recommendation letters for medical and dental school applicants. Many of these requests are from undergraduates who took my large (nearly 100 student) Evolution course. Unfortunately, I don’t have the time to interview all of the students individually, and I usually only get to know a handful of students well enough to write a proper letter. I typically reply to a request for a letter with a request for more information. I ask the students for the following:

(1) What is your overall GPA?
(2) Why do you want to go to Dental/Med School?
(3) Where did you grow up?
(4) What were the topics of your assignments in my class? 
(5) What was your final grade (numerical) in the class?
(6) Why did you choose LSU?
(7) What volunteer opportunities have you taken advantage of as an undergrad?
(8) Do you have research/internship experience?
(9) Please send along a CV/resume if you have one.
(10) Please provide any additional information you think would help me write your letter…

Once I get the answers to those 10 questions/inquiries I usually have plenty of information to write a more personal and useful recommendation.  Question 1, overall GPA, usually gives me a clue what the chances are that this student will actually get into medical or dental school. Q2 tells me why they want to go to one of these schools – if they don’t have a good answer to why they want to be a doctor, they are unlikely to become one. Questions 3-6 basically tell me (a) whether they are truthful (because I already know their grades and assignment scores) and (b) their level of ambition and undergrad background. Questions 7 and 8 tell me if they are just trying to do well in classes or if they actually tried to accomplish something outside of class. Why would you come to an R1 (Research 1) university and not try to work in one of your professors’ labs? If you haven’t done any research or volunteer work then all you have are your grades, and that isn’t enough. Those students with lots of volunteer hours or research experience have taken advantage of their time as a student and are the most likely to succeed. Q9 and Q10 help me round out the letter and make it as personal as possible.
            Not only do these questions help me write the best letter possible for the students, they also help me write the letter more easily. Rather than struggling to remember how the student stood out in my class, I can have more direct answers that tell me what kind of person they are and how they compare to my other students (because they all answered the same questions). Also, with these answers I can plug big chunks of text into a letter already formatted for medical and dental school applications. Most professors are modifying the same letter over and over again (we often get dozens of requests a year); at least with these questions I can still make my “standard” letter pretty specific to the individual student.

Thursday, January 29, 2015

On Academic Peaks


I like to use the metaphor of peaking when talking about highly productive times at different stages of your academic career. Think of these peaks as the high-water marks (i.e., the year your most high-profile papers came out and you got a new grant). These peaks are preceded by periods of heavy data gathering and much writing, and followed by periods of transition, where new methods are being learned and the finishing touches are put on loose ends of major projects. I think these peaks should come every three to five years and they are important milestones in your academic career. The first major peak should be around the 3rd year of your PhD program, another sometime during your postdoc, and your highest peak should be during the midpoint of your time as an assistant professor (3rd year pre-tenure or so). There are other peaks (e.g. just before going up for associate and full professor) but let’s talk about the three major ones in more detail.
            When you are starting off in grad school (let’s assume a PhD program), you want to be a sponge learning new techniques and gathering data over the first couple of years. As you learn to write during these years it usually takes until at least your 3rd year for the publications from those early works to start coming out. That’s a good thing because that’s usually the time you go up for your qualifying exams (to be a PhD candidate in good standing). The students that have some pubs coming out around this time are usually on the fast track. Those pubs will be some thesis chapters, but also collaborative side projects with others. Once you reach this peak, the thesis committee usually is okay with passing you for these qualifying exams, making you a PhD candidate. After that peak, students typically focus on the meatier sections of their thesis and getting them ready for publication.
            Another peak should come at some point during a postdoc. You’ve learned some new skills as a graduate student, and those techniques will make you marketable to others. After you get a postdoc, you won’t need to worry about the constraints you had in graduate school like classes, or friendships (just kidding here, but usually postdocs are kind of in limbo in their new short-term work environment, so friends are harder to come by for sure). Without these constraints you should hit the ground running and publish like mad, collaborating with your new lab, finishing up old projects, getting your last thesis chapters published. This peak should push you out onto the job market.
            So you got a job; time to finish your own personal Mount Everest climb of academic peaks. As you get your new desk tidy, turn on your new computer for the first time, and figure out how to order everything from pens to major lab equipment, you should also be setting yourself up for a big peak around the midpoint of your time before you go up for tenure (again around the 3rd year). Your pubs from your postdoc should be coming out (always include your new and old address on these pubs) but also the new cool things you started on at your new position; those things you always wanted to try but didn’t have the independence to attempt. It is all those ideas you put together for your “future plans” slides in your job talk that are coming to fruition. During your first and second year of your job you should have a good bit of start-up to spend and hopefully you have been applying for grants at this time. If you get that grant before your start-up runs out you are in good shape. This time should be the most productive period of your career; you still have postdoc skills, but you also have your own lab, and those people are being productive as well, feeding off your ideas and plans.
            These peaks aren’t set in stone at these different time periods but I like to think of them as goals you are trying to reach. Of course you can have a brilliant career peaking at very different times but you don’t want to have your highest peak as a postdoc, or in grad school, and peter out at your new job. And of course these recommendations are just based on personal observations of people’s work and career paths that I’ve generalized here. Sometimes new graduate students get impatient and discouraged about the pace at which publications are coming out, so I always tell them it is important to be patient and that it really isn’t until their 3rd year that we expect those pubs to be coming out at a regular clip. Likewise, a postdoc with no pubs for a few years certainly isn’t peaking, and almost certainly isn’t getting a tenure-track job anytime soon. A new faculty member in his/her 4th year without a grant and with few pubs might have quite a few things come out in Year 5 but by then the voting faculty will already be thinking of that person with whispers of them not having the stuff to get tenure. That person might still get tenure with the last-minute drive but that late peak will be remembered and sometimes considered a negative (e.g. they might say, "This person couldn’t get their stuff together in time for their 3rd year pre-tenure review. Will they be a good scientist with tenure?"). So yes, these peaks are generalizations but they are good things to keep in mind as you move up the academic landscape with all its peaks and valleys.

Wednesday, July 16, 2014

A Terrible Cover of Science Magazine & An Example of The Power of Today's Social Media


I checked my mailbox this afternoon, and looked at the cover of my Science magazine and thought: “Whoa, this should have been in a brown paper covering like they had for dirty magazines.” I saw an image on the cover of provocatively clothed women, the title being “Staying a Step Ahead of HIV/AIDS.” Why did this picture need to be on the cover for that story I thought? The image made me think that Science was trying to be incendiary, hip or edgy or something. I put the magazine and my thoughts about it aside; then just before I was about to have a meeting in my office, I decided I needed to put the journal face down and out of view.

I tweeted the joke about the brown paper covering but then decided the cover was still bugging me and tweeted: “When we said we wanted more women in Science this is not what we meant.” I tagged @AAASmember.

I thought that would be the end of it, but then Jim Austin (@SciCareerEditor) the Editor of Science Careers (from Science) quoted my tweet and sarcastically replied “Good one.” Shortly after he tweeted, “Am I the only one who finds moral indignation really boring?” I was off Twitter, but Jacquelyn Gill (@JacquelynGill) and others were luckily paying attention. They called him out and the twitterverse went after him and the stupidity of that cover pretty strongly. A few hours later we got a response from Marcia McNutt (@Marcia4Science) Editor-in-Chief of Science magazine, “From us at Science, we apologize to those offended by recent cover. Intent was to highlight solutions to HIV, and it badly missed the mark.”

It was nice of them to apologize especially after Jim Austin’s comments. Read more about the entire exchange from the tweets I highlighted on Storify https://storify.com/LSU_FISH/sexist-and-transphobic-cover-of-science

The cover is still up on-line, and the magazine is still on my desk face down. I was disappointed in Science for publishing a cover that I thought objectified the people in the image, and I was more disappointed by the initial response from Jim Austin. I’m glad the head editor apologized but I was most moved by the quick response from Twitter. In the old days (when they really did put brown paper over dirty magazines) we’d see something like this, maybe shake our fists, maybe even write a letter (like actually write a letter) to the offending party and maybe in 3-6 weeks something would come of it (usually nothing). We might have thought, “well just another example of sexism” and let it slide. Thanks to social media I’m glad at least we all got to vent and share our collective impressions and opinions. I found out I was not alone in being offended, and we all shared a common message that the image was inappropriate. We even got a rapid apology from the editor. And maybe, just maybe, I think the people behind that cover of Science will think twice next time they consider a cover that might be sexist, homophobic, or otherwise just wrong.