Celebrate #ESA100 & promote #openscience in ecology through synthesis by publishing your synthesis datasets. #ecosynthesis


Preamble
Summarizing 100 years of ecology and looking forward should incorporate formal synthesis tools. In the spirit of promoting these efforts, for better or worse, I pulled together all the synthesis datasets I have collaborated in building and published any outstanding ones online this week.

I discovered that the meta-data we keep for our derived datasets is ‘less than optimal’, that there are some similarities across synthesis datasets (particularly meta-analyses), and that as a rule of thumb figshare or oneshare are great spots for these types of data. I realize that primary data on ecological systems absolutely need formal meta-data and should be published in repositories with structured meta-data such as KNB, but derived data can still likely have utility in other repositories. GigaScience and Scientific Data are also great homes for more complete data packages.


Published, representative synthesis datasets
Here are all synthesis datasets published to date.  I have only one left to dig up, clean up, and formalize before publication.

A meta-analytic dataset of plant facilitation in coastal dune systems: responses, regions, and research gaps.

Tree invasions dataset: a comparative test of the dominant hypotheses and functional traits

A meta-analysis of the ecological significance of density in tree invasion

The summary data for a review of the relationship with pollen limitation of plant reproduction

Dataset for the diversity of diversity studies: retrospectives and future directions

The relative success of studies of plant facilitation 2009

The dataset for a systematic review of the attractant-decoy and repellent-plant hypotheses: do plants with heterospecific neighbours escape herbivory?

Dataset examining functional assessment of animals with plant facilitation complexes

Dataset for A systematic review and conceptual framework for the mechanistic pathways of nurse plants

Dataset for Land management trumps the effects of climate change and elevated CO2 on grassland functioning

A systematic review of the ecological literature on cushion plants

A systematic review of arthropod community diversity in association with invasive plants

Indirect interactions in terrestrial plant communities: emerging patterns and research gaps Dataset

Opportunity
As a community, I would love to see the other synthesis datasets out there too. I have found quite a few but they are often in the form of online supplements associated with standard publications. There could be some really neat connections across meta-analyses between conservation, ecology, and different taxa.

If you have derived, synthesis datasets published (and done all that work to aggregate independent data), please publish then share them with the tag #ecosynthesis. If you do it leading up to the ESA annual meeting, use the tag #ESA100 too and folks can explore them at the meeting!


How to embody #openscience in your teaching


Preamble
Recently, I have come to realize that there is a real disconnect between how many professors teach biology and how they do their science. There are numerous open science approaches associated with conducting research, including but not limited to the following activities: publishing data, open peer review on PeerJ, online commentaries on blogs or Facebook, using social media such as Twitter to keep up to date on research, reviewing meta-data, online notebooks, sharing and downloading code, working on GitHub, and publishing/sharing decks for conference presentations online. However, in teaching science, we encourage a closed system of testing, research, and sharing. Tests or essays are generally written for and shared only with the teaching assistant or professor, no other products are generated, and we use only the library system or Web of Science to search for and secure the research and ideas we need. I feel that this is too limited. That is not to say that we do not also need those skills, but some familiarity with communicating science, funding science, reviewing science, and becoming more web-centric in science is also critical for evidence-based, information-science-literate citizens.

Ideas
Take any of the above activities that you, your students, your collaborators, or those that you admire in science use to promote open, collaborative, & reproducible science and include them in your courses. Use traditional testing approaches such as a short- to long-answer test sparingly (but at least once, to assure the university you are still doing your job and to give students who test well the opportunity to do so), mixed with several rigorous outward-facing products that students produce (but provide the option to share less widely to respect student choice). This teaches practical science communication skills and embodies open science in what you teach.

Practice what you preach in what & how you teach. #opensci & #scicomm right in the classroom by pushing work to the web. The ‘what’ and ‘how’ are both important with the ‘what’ elements as student products that are evaluated and the ‘how’ elements as your approach to sharing course materials (data, readings, decks, course information via public web-centric tools) and assigning value and merit to skills in addition to thematic content.


Testable products I have used in teaching in 2014 that were graded & positively received
1. Generate an infographic that summarizes the quantitative & qualitative findings from deep research on a select student topic.
2. Publish term paper as a PeerJ pre-print. Teaching assistant or marker provides feedback using the PeerJ annotation tool. Grades are provided via email individually to each student. Links to all papers are provided to other students to review after evaluation. Capitalize on the audience effect (i.e. quality of student papers dramatically and non-linearly increases when more than one other person will see the paper and the student is aware of this). Consider providing an opportunity to revise to explore versioning & annotation/peer-review.
3. Set up a shared Google Drive folder for students and teaching assistants to share materials. This is a simple way to avoid email and promote sharing in real time.
4. Use a blog to move discussion in class online and to enable those that prefer to post/ask questions with less risk anonymously. Set open commenting but review before allowing public sharing.
5. Collect data in groups, publish datasets on figshare with meta-data and appropriate a priori defined tags. Teaching assistants mark datasets weekly on figshare using a standard rubric.
6. Visit a data repository such as KNB and download a dataset. Read the meta-data and do some basic statistics with the data. Surprisingly hard to do and an awesome lesson in discovering the importance of meta-data.
7. Identify an important ecological/enviro issue online, not from Web of Science, and link to human well-being.

These are the ones I have tried to date, but SlideShare decks, figshare experimental design outlines, and many others are likely viable too. I have also been considering GitHub, YouTube videos by students, and iNaturalist contributions.
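For item 6 above, here is a minimal sketch in Python of what the exercise can look like once a dataset is in hand. Everything in it is a hypothetical stand-in: the inline CSV takes the place of a file downloaded from a repository, the `METADATA` dictionary takes the place of the repository's meta-data record, and the column names (`site`, `shrub`, `richness`) are invented for illustration. It only checks that each column is described and computes group means and standard deviations.

```python
import csv
import io
from statistics import mean, stdev

# Hypothetical stand-in for a dataset downloaded from a repository.
RAW_CSV = """site,shrub,richness
A,present,7
A,absent,4
B,present,9
B,absent,5
"""

# Hypothetical meta-data record: column name -> description.
METADATA = {
    "site": "Study site code",
    "shrub": "Whether the plot is under a shrub canopy (present/absent)",
    "richness": "Number of plant species per plot (count)",
}


def undocumented_columns(rows, metadata):
    """Return column names that appear in the data but not in the meta-data."""
    columns = rows[0].keys() if rows else []
    return [name for name in columns if name not in metadata]


def summarize(rows, group_col, value_col):
    """Return {group: (mean, standard deviation)} for a numeric column."""
    groups = {}
    for row in rows:
        groups.setdefault(row[group_col], []).append(float(row[value_col]))
    return {group: (mean(values), stdev(values)) for group, values in groups.items()}


rows = list(csv.DictReader(io.StringIO(RAW_CSV)))
print("Columns missing from meta-data:", undocumented_columns(rows, METADATA))
print("Richness by shrub cover:", summarize(rows, "shrub", "richness"))
```

With a real download, the first function is where students tend to get stuck: any column the meta-data does not describe cannot be analysed with confidence.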


Notes on big data, little data convergences from an ecological perspective.

The seminar I have developed (http://bit.ly/bigdata-littledata) feels very Dr. Seuss.


Abstract

Recently, I have been examining Big Data issues for ecology and Little Data issues for athletes. I realized that the challenges & solutions within each domain are very similar. Ecology with big data and experiments on a limited number of athletes with little data overlap significantly. The relative importance of framing the contextual evidence and using appropriate synthesis simplifications suggests that a web-centric, open-science approach to many disciplines of research will promote more effective detection of important factors at both ends of the data spectrum. Connecting ideas is great; connecting data is even better.

Notes to make sense of deck & tweets

concept tweet
I became an ecologist because I love being outside. #outside time is #goodtimes
However, I now recognize that pure field ecology without consideration of data, web dissemination, & open science reduces value. outside only, no inside computer time linking data, writing meta-data, bad.
Convergence is the combination of disparate phenomena. Convergence is the combination of disparate phenomena. Field ecology & open data must combine.
Three steps to convergence: create, collect, combine. Scientific convergence steps: create, collect data, combine.
Data establish convergence. Data establish convergence, promote novel connections, & reciprocally accelerate quantification.
Connecting ideas is great, connecting data is better. Connecting #ideas is great, connecting #data is better.
Adventure alignment between collecting & connecting data should be neutral.
Research scientist with experience with data. Collecting it, sharing it, using it, losing it, needing it, & failing to untangle it. I am an ecologist. Ecology is always about interactions. Biotic-biotic-abiotic and the complexity of those networks. Ecology is about interactions & important for #BigData to also untangle/identify meaning
I work in deserts exploring the importance of just a single shrub species that facilitates or helps other species. We build datasets for all the different players/participants/interactors in an effort to understand the importance of interactions in maintaining resilience structure. In ecology, we sometimes need to connect little data to make #BigData
The goal is to build interaction networks, not just foodwebs, and include horizontal interactions to map the complexity of these systems. Building networks is a viable solution to #BigData complexity.
Ecology can help us understand and manage big data. It is not a big stretch from ecological networks to big data as balls of yarn that you would love to knit together into something useful. Ecology is about connecting the dots. Both untangle #BigData & knit together patterns
Big Data are not static, nor isolated to interactions with machines. We embody big data. The web is big data. Big Data want you and already have you. Interacting with #BigData generates more data. We now embody data.
V is for Vampire and Big Data are all about V (and vampires). V is for vampire & #BigData. Volume, Variety, Velocity. Take control of relationship with data (ownership, privacy, download)
Example: Walmart uses a ‘social genome’ approach combining public data from the web, social data, and proprietary data such as customer purchasing data and contact information. Walmart uses a social genome to knit together #BigData for product placement, stocking, and consumer context.
Example: Google flu compares query counts with traditional flu surveillance systems. Interacting with Big Data generates Big Data. Your search for information is information related to you, your ecology, and your ecosystem. #BigData reciprocity
Example: remote sensing provides rich datasets but scale is a challenge. A recent innovation is exploring the mechanistic links between climate and the environmental sensitivities of organisms, which occur through the microclimatic conditions that organisms experience. Remote-sensing #BigData provide regional, landscape, and sometimes local context for dynamics
Example: The abundance & distribution of birds, butterflies, mammals, and many other organisms are recorded and mapped by citizen scientists. Global #BigData of abundance & distribution are rapidly growing for many organisms #citizenscience
C is for challenge. We accept the challenge. It is an adventure we cannot avoid. C is for capture, curation, context (meaning) & complexity-analytics (both for many smaller datasets aggregated or singular larger ones). The adventure is to solve these challenges at multiple scales and for multiple functions from individual to industry to countries to global challenges. C is for challenges in #BigData: capture, curation, context & complexity-analytics. It is an unavoidable adventure we must accept.
For me data are evidence. Material and immaterial. Data at many volumes can illuminate context, connections, or interactions and I see solutions that help me capitalize on opportunity and own the data. #BigData are evidence. Use it to illuminate context, connections, and most importantly interactions in my research and in my life.
CONTEXT It is informative to a limited extent to see where you are in a distribution, landscape, or constellation of points. Context solution: even a single data point in #BigData can be informative.
INTERACTIONS: focus on interactions. Archive & aggregate your datasets. To archive, share but set appropriate permissions & privacy. Interactions solution: focus on schema & aggregation of #BigData
SYNTHESIS: Find and use metrics, indices, or effect size metrics that simplify your big data and allow it to connect other evidence. Synthesis solution: use metrics that estimate/summarize relative change to connect #BigData
The opportunity for context, interactions, & synthesis is only accelerating with 1-3 billion online, smartphones, and the capacity for threaded Big Data. However, you have to own it. Correlation almost always implies causation, use that to your advantage in #BigData
Correlation almost always implies causation. Use that to your advantage to seek explanations, context, and the real factors that influence the outcome of interest. Smartphones change everything for #BigData with 3 billion online – own your interactions
I challenge you to spend only 1 minute on www.worldmeters.info and not be inspired to seek synthesis. Two profound examples but there are many more using evidence to make the best possible decisions. Data are not everything but can complement positive values & logic. 1 min on www.worldmeters.info gives you a feel for #BigData volume. @nceas & #Cochrane are inspirational solutions
Ecological reasoning predicated upon interactions PLUS big data is a big adventure. Context, interactions, and synthesis are three simple steps or tools that we need to solve not just personal but global challenges to more effectively & healthily live on this planet. #ecology + #BigData = CIS-tem (context, interactions, synthesis) needed to face global challenges & live better. Use evidence to decide.
Metascience & scientometrics are two important research domains evolving in ecology & other disciplines
Explore the capacity for research products, primarily peer-reviewed publications, to connect to one another.
Structural equation models & response surface methodologies are becoming increasingly common in ecology.
The internet of things and micro-instrumentation with loggers is transforming mechanistic ecological research.
The inherent value of data as an independent, valid research output is increasing.
DataCite is working to promote effective metadata schema & standards for publishing datasets.
Novel evidence datastreams that align with #BigData on the web are common in the natural sciences now too.
Sharing code is also an important advance in effective data discoveries in ecology and many disciplines.
Big Data, Little Data challenges & solutions converge.
It is unlikely that too much running kills, but it does illustrate the importance of context in datasets.
Little datasets are not necessarily simple. Can be deep but not wide and a challenge to handle. Little datasets are not necessarily simple.
Little data challenges include contrast, representativeness, & power.
Solutions include pre-post contrasts, effect sizes, contrasts to data landscape data.
Representativeness can be explored by changing scales or sensitivity analyses.
Use power to design pilot experiments and explore realistic expectations.
Big Data, Little Data issues thus converge through framing & synthesis simplifications.
Shut down computer, go outside BUT capitalize on data convergences to maximize collection to connection value.
Web-centric ecology should embrace open-science research objects in all forms.
Meta-data, interactions, and novel data streams in ecology are an emerging opportunity.
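
The synthesis and little-data solutions in the notes above (effect sizes, metrics that summarize relative change) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the numbers are invented, the function names are hypothetical, and the two groups are treated as independent samples, which is a simplification for a true pre-post design on the same athletes. It computes the log response ratio, a common metric of relative change in ecological meta-analysis, and Hedges' g, a standardized mean difference with a small-sample correction.

```python
import math
from statistics import mean, stdev


def log_response_ratio(treatment, control):
    """Natural log of the ratio of group means (relative change)."""
    return math.log(mean(treatment) / mean(control))


def hedges_g(treatment, control):
    """Standardized mean difference with the small-sample correction.

    Assumes two independent groups; a paired design would need a
    different standard deviation.
    """
    n1, n2 = len(treatment), len(control)
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(
        ((n1 - 1) * stdev(treatment) ** 2 + (n2 - 1) * stdev(control) ** 2) / df
    )
    d = (mean(treatment) - mean(control)) / pooled_sd
    correction = 1 - 3 / (4 * df - 1)  # approximate small-sample bias correction
    return d * correction


# Invented example values: a response measured after and before an intervention.
post = [12.0, 15.0, 11.0, 14.0]
pre = [10.0, 12.0, 9.0, 11.0]

print("log response ratio:", round(log_response_ratio(post, pre), 3))
print("Hedges' g:", round(hedges_g(post, pre), 3))
```

Because both metrics are unitless, the same calculation applies to a handful of athletes or to thousands of plots, which is the sense in which the two ends of the data spectrum converge.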