Review journals or journals with synthesis format contributions in EEB

Colleagues and I were checking through current journal listings that either explicitly focus on synthesis (e.g. systematic reviews) or include a section that is frequently well represented by synthesis contributions. Most journals in ecology, evolution, and environmental science that publish standard primary research articles also offer the opportunity for these papers, but acceptance can be less frequent, or less likely for certain forms of synthesis (systematic reviews in particular versus meta-analyses).

List

Diverse synthesis contributions very frequent
Conservation Letters (Letters)
Perspectives in Science
Perspectives in Plant Ecology, Evolution and Systematics
Diversity & Distributions
Ecology Letters
TREE
Oikos
Biological Reviews
Annual Review of Ecology, Evolution, and Systematics
Letters to Nature
Frontiers in Ecology and the Environment
PLOS ONE (many systematic reviews)
Environmental Evidence
Biology Letters
Quarterly Review of Biology

Frequent synthesis contributions with some diversity in formats
Global Ecology and Biogeography
Annals of Botany
New Phytologist
Ecography
Ecological Applications
Functional Ecology
Proceedings of the Royal Society B
Ecology and Evolution

Rules of thumb for better #openscience and transparent #collaboration

Rules-of-thumb for reuse of data and plots
1. If you use unpublished data from someone else, even if they are done with it, invite them to be a co-author.
2. If you use a published dataset, at the minimum contact authors, and depending on the purpose of the reuse, consider inviting them to become a co-author. Check licensing.
3. If you use plots initiated by another but in a significantly different way/for a novel purpose, invite them to be co-author (within a reasonable timeframe).
4. If you reuse the experimental plots for the exact same purpose, offer the person that set it up ‘right of first refusal’ as first author (within a fair period of time such as 1-2 years, see next rule).
5. If the same data continue to be added to an ongoing experiment, first authorship can shift to more recent researchers who do significant work, because the purpose shifts from short- to long-term ecology. Prof. Turkington (my PhD mentor) used this model for his Kluane plots. He surveyed for many years and always invited primary researchers to be co-authors but not first authors. They often declined after a few years.
6. Set a reasonable authorship embargo to give researchers who have graduated or changed professional focus a generous chance to be first authors on papers. This can vary from 8 months to a year or more, depending on how critical it is to share the research publicly. Development pressures, climate change, and extinctions wait for no one, sadly.
Rules-of-thumb for collaborative writing
1. Write first draft.
2. Share this draft with all potential first authors so that they can see what they would be joining.
3. Offer co-authorship to everyone who contributed appropriately at this juncture and populate the authorship list as firmly as possible.
4. Potential co-authors are invited to refuse authorship but err on the side of generosity with invitations.
5. Do revisions in serial, not parallel. The story and flow get unduly challenging for everyone when track changes are layered.

A set of #rstats #AdventureTime themed #openscience slide decks

Purpose

I recently completed a set of data science for biostatistics training exercises for graduate students. I extensively used R for Data Science and Efficient R programming to develop a set of Adventure Time R-statistics slide decks. Whilst I recognize that they are very minimal in terms of text, I hope that the general visual flow can provide a sense of the big picture philosophy that R data science and R statistics offer contemporary scientists.

Slide decks

  1. WhyR? How tidy data, open science, and R align to promote open science practices.
  2. Become a data wrangleR. An introduction to the philosophy, tips, and associated use of dplyr.
  3. Contemporary data viz in R. Philosophy of grammar of graphics, ggplot2, and some simple rules for effective data viz.
  4. Exploratory data analysis and models in R. An explanation of the difference between EDA and model fitting in R. Then, a short preview highlighting modelr.
  5. Efficient statistics in R. A visual summary of the ideas from the ‘Efficient R Programming’ book, including chunking your work, efficient planning, and efficient coding suggestions in R.
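As a small taste of the wrangling and viz decks, here is a minimal sketch in the spirit of those slides (the iris data ships with R; the particular summary chosen is just an illustration):

```r
library(dplyr)
library(ggplot2)

# Wrangle: mean petal length per species with a dplyr pipeline
summary_table <- iris %>%
  group_by(Species) %>%
  summarise(mean_petal = mean(Petal.Length))

# Viz: a grammar-of-graphics scatterplot with ggplot2
p <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length, colour = Species)) +
  geom_point()
```

The pipe reads top to bottom like prose, which is much of the appeal the decks try to convey visually.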

Here are the knitted R Markdown HTML notes from the course too, https://cjlortie.github.io/r.stats/, and all the materials can be downloaded from the associated GitHub repo.

I hope this collection of goodies can be helpful to others.


 

A review of ‘R for Data Science’ book @hadleywickham #rstats #openscience

Data science

Data science is a critical component of many domains of research, including the domain in which I primarily function – ecology. However, in teaching biostatistics within the university context, we have typically focussed on the statistics and less on the science of data (i.e. handling, understanding, and manipulating data). This is unfortunate, but the teaching landscape is now rapidly evolving to include offerings of numerous institutional Master’s of Data Science degrees.


It has taken me an embarrassingly long time to appreciate the differences between data science and statistics. My teaching has embraced open science and shared many of the skills that students need to be scientifically-literate citizens. However, data-literate citizens are important too if we want the next generation to make informed, evidence-based decisions about health, the economy, and the health of our ecosystems. Critical thinking tools for data are non-trivial concepts and statistics are absolutely needed. However, the science of data, big or little, is critical in appreciating the decisions, steps, and workflows needed to prepare, share, analyze, collaborate, and evaluate quantitative and qualitative data. I have been on a reading binge to this effect to both appreciate the value of data science thinking and improve the skill set that I can share with students and some collaborators. Last week, I completed my latest adventure – ‘R for Data Science’ by Garrett Grolemund & Hadley Wickham.


Review

The book was written in R markdown, compiled using bookdown, and it is free online. Appropriately, it thus embodies both open science and data science in how it is written. Bookdown is a package for R that knits a set of R markdown files together into a book. This is important because it is open, you can clone the book from GitHub, it is written using one of the most powerful open science/data science tools, i.e. R (language and environment), and in reading online and seeing the code, you also appreciate the trickle effects of ‘open data science’ thinking to writing, collaboration, and even publishing. This is all incredible, and it is a peek into a very different future of scholarly communication. The book is nearly complete. I read what was available because I teach soon. It confirmed and advanced my understanding and skill set for data science immensely. Here is a brief summary, without spoilers, of some of the dimensions I used to conclude that this book is fantastic.

Language & clarity
In reading R statistics, statistics, or data science books, one expects/hopes that, like literate coding, the prose will be accessible, pleasant, and appropriately pitched. This book was ideal in this respect. It was more formal than conversational but not too technical. The structure was clear and logical, which facilitated comprehension and reading. The visuals added a dimension of attractive clarity beyond the code, prose, and data viz alone. Many of the visuals were excellent heuristics. Some reminded the reader of the big picture in data science whilst others highlighted a particular workflow/approach.

Example of big picture visual.

[Figure: data-science-explore workflow diagram]

Example of mechanistic heuristic.

[Figure: join-many-to-many diagram]

These were extremely useful. I could have even used more here and there, but in digging into the examples, I recognize that they were likely not always needed (and too much can be a bad thing too if poorly executed). The clarity was very high in almost every chapter of the book. I struggled with some of the more complex chapters (for me) such as relational data or some elements of the model building, but the flow kept me rolling through these even if some of the details eluded me.

[Figure: join Venn diagram]

The expectation that data science or statistics books should only be read once is a challenging notion. Many of the chapters in this book certainly satisfy that criterion, but it depends on the purpose. Some of the more challenging chapters can be re-read for better comprehension, and one could also follow along and experiment with them in RStudio. Sometimes, it is nonetheless good to get the message from alternate sources, described or explained a little differently. In my R reading bonanza, some of the R-statistics books will not be revisited. My feeling for R for Data Science is that the clean style and direct writing do not muddle the message, and re-reads would likely be beneficial when needed. The message in many chapters is also unique, and even a brief revisit would highlight some of the handling elements and assumptions associated with best practices for data science.

Philosophy
Welcome to the tidyverse. Enough said to all that follow and read up within the R community. This universe is logical and feels natural. The forthcoming ggvis will help further align the grammar and semantics that parallel the code and flow with pipes versus ‘+’ of ggplot2. Tibbles are a pleasant surprise. The wrangle readings satisfy. Tidiness is next to high-orderedness. Subscribing to the philosophy of readable code, consistent data structures, and logical workflows will promote better open science and reproducibility. This is never really explicitly stated, or if it was, I missed it. I suspect that this is a good thing. We can approach open science, open data, and more transparency in science from top-down or bottom-up efforts. By not repeatedly banging that drum per se but directly providing and describing the tools to handle data cleanly and consistently, this book provides a solid bottom-up pillar for the open science movement. Tidy data and readable code are shareable AND useable. Finally and aligned with this tools-first approach, the value of models and epistemology of hypotheses are stated later in the book (Chapter 19). This worked for me in reading this book but likely not in teaching to students. I like the hypothesis/model philosophy of ‘knowing data’ developed here. It was big data in origins, balanced, and emphasized bias and non-independence in exploring and testing models. What you can learn from a model also depends on how it is applied. This was well described. Split. Build. Think. Test. Know.
Your own personal variation would likely fit within a similar framework even with little data. I did wonder a bit how I can adapt some of the model-fitting ideas to the little data common in some of the ecological inquiries (solutions: (i) pilot field experiments can provide the training data, and (ii) resampling/bootstrapping using modelr to populate larger datasets for more independent EDA). The reminder to avoid repetition is repeated. Not ironically.
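To make the little-data resampling idea concrete, here is a minimal base-R sketch (the biomass numbers are purely hypothetical; modelr offers resample helpers that wrap this idea more elegantly):

```r
# Hypothetical small field dataset: plant biomass under a treatment (n = 6)
biomass <- c(2.1, 3.4, 2.8, 4.0, 3.1, 2.5)

# Bootstrap the mean: resample with replacement many times
set.seed(42)
boot_means <- replicate(1000, mean(sample(biomass, replace = TRUE)))

# A percentile interval gives a sense of uncertainty despite the tiny n
ci <- quantile(boot_means, c(0.025, 0.975))
```

The same pattern extends to refitting a model on each resample, which is where the modelr workflow in the book picks up.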

Skills
Many books do not need to adapt. Most R statistics books likely do. Packages are often a gamechanger. Grammar changes. Base R is a must-know of course, but streamlining and specifics often live in the libraries the community develops. This book is available for sale on Amazon, and I assume it will adapt, but more slowly than the bookdown version. The frame-rate of change in no way precludes reading the book now or revisiting it at some later point in time. The model building chapters, the basics of wrangling, functions, and iteration are solid reading that provide a skill set needed right now. The data viz and perhaps data transformation chapters are most likely to change soon. Read now and capture those skills, but expect change. There are also some nice examples of intermediate to advanced tricks in plotting that are worth reading now. Certainly, this is the case in the iteration and model chapters too – good intermediate building blocks for advanced coding and data science. This skill set is pretty darn awesome (PDA), and the strings chapter was also very rich in new skills and a launchpad to text mining with other packages (it inspired me to try it right after completing the book). Skills abound.

Bottom line (of code) review for readers

high.returns <- c("basic.R.users", "intermediate.R.users")

tidy.data.science <- philosophy of consistent structures %>% visualize with models %>% share

Implication
There are many tools for open science (data management plans, slideshare, data repositories, GitHub, preprints, sharing meta-data, social media, blogs, and data publications). However, effective data science in R can also be a powerful ally if you include the final steps of communicate (Chapters 23-25).
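As a sketch of what that communicate step looks like in practice, here is a minimal R Markdown stub (title and content hypothetical) showing how prose and live code knit together into a shareable HTML document:

````markdown
---
title: "My open analysis"
output: html_document
---

The results below are recomputed every time the document is knitted,
so the prose and the numbers cannot drift apart.

```{r}
summary(iris$Petal.Length)
```
````

Knitting this file (e.g. via the Knit button in RStudio) produces an HTML page with the code and its output embedded, which is the same open workflow the book itself was built with.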


 

Posted in R

The importance of #upgoESA experiment by @DrHolly #ESA2016

The ‘Up-Goer Five Challenge: Using Common Language to Communicate Your Science to the Public‘ session was an experiment.  It was a brilliant success. Enjoyable and profound because of the direct and indirect discoveries in how we communicate and share. Semantics are important. Scientific language conveys complexity. Complexity can become a barrier. Simpler language tends to highlight emotions. Using simpler words can change meaning but make the narrative more powerful.  The main direct discovery was that we function, as scientists and communicators, on a continuum from jargon to overly simple, and we need to find the sweet spot in using complexity appropriately in sharing our findings with others (and one another).


However, I propose the ‘experiment’ need not have been successful for us to learn. Experiments are about discovery. We learn as much from error as success in science. Trials are useful. The most exciting element of the up goer five model for talks was the fact that Dr. Holly Menninger proposed the session, it got approved, and many people participated (in speaking, attending, and the discussion). We need to try things out. We need to experiment with scientific communication just like we experiment with research systems and test hypotheses and predictions. There is a field of research in communication studies, and I am not proposing we must also become experts in that too. However, ESA meetings are a safe place for ecologists. At the minimum, we can try some new things in how we communicate with one another and explore efficacy and potential for different audiences. There is likely no one best way for every context. Importantly, we can practice taking risks. Each of us needs to decide what we are comfortable with. An oral session, poster, or ignite talk, for instance, each comes with different risks and challenges. The upgoESA model provided an alternative opportunity that came with new risks. However, we benefitted from the experiment and made some discoveries. Consequently, I propose we continue to look outward like Dr. Holly Menninger did and continue to bring new opportunities to future ESA meetings that explore how we communicate. PechaKucha, slide karaoke, video abstracts, streaming, micro-writing groups, hackathons, datashareathons, and more meetups are all viable experiments too. The session ‘Ecology on the Runway: An Eco-Fashion Show and Other Non-Traditional Public Engagement Approaches‘ was also an experiment with risks, entertainment, and a different set of messages.

We need to continue to hack the conference model and treat it like our own collective experiment to become better communicators. Plus, experiments are fun.


Posted in fun