How to embody #openscience in your teaching


Preamble
Recently, I have come to realize that there is a real disconnect between how many professors teach biology and how they do their science. Numerous open science approaches are associated with conducting research, including but not limited to: publishing data, open peer review on PeerJ, online commentaries on blogs or Facebook, using social media such as Twitter to keep up to date on research, reviewing meta-data, online notebooks, sharing and downloading code, working on GitHub, and publishing/sharing decks for conference presentations online. However, in teaching science, we encourage a closed system of testing, research, and sharing. Tests or essays are generally written for and shared only with the teaching assistant or professor, no other products are generated, and we use only the library system or Web of Science to search for and secure the research and ideas we need. I feel that this is too limited. That is not to say that we do not also need those skills, but some familiarity with communicating science, funding science, reviewing science, and becoming more web-centric in science is also critical if students are to become evidence-based, information-literate citizens.

Ideas
Take any of the above activities that you, your students, your collaborators, or those that you admire in science use to promote open, collaborative, & reproducible science and include them in your courses. Use traditional testing approaches such as a short to long answer test sparingly (but at least once, to assure the university you are still doing your job and to give students who test well the opportunity to do so), mixed with several rigorous outward-facing products that students produce (but provide the option to share less widely to respect student choice). This teaches practical science communication skills and embodies open science in what you teach.

Practice what you preach in what & how you teach. #opensci & #scicomm right in the classroom by pushing work to the web. The ‘what’ and ‘how’ are both important with the ‘what’ elements as student products that are evaluated and the ‘how’ elements as your approach to sharing course materials (data, readings, decks, course information via public web-centric tools) and assigning value and merit to skills in addition to thematic content.


Testable products I have used in teaching in 2014 that were graded & positively received
1. Generate an infographic that summarizes the quantitative & qualitative findings from deep research on a select student topic.
2. Publish the term paper as a PeerJ pre-print. The teaching assistant or marker provides feedback using the PeerJ annotation tool. Grades are provided via email individually to each student. Links to all papers are provided to other students to review after evaluation. Capitalize on the audience effect (i.e. the quality of student papers increases dramatically and non-linearly when more than one other person will see the paper and the student is aware of this). Consider providing an opportunity to revise to explore versioning & annotation/peer-review.
3. Set up a shared Google Drive folder for students and teaching assistants to share materials. This is a simple way to avoid email and promote sharing in real time.
4. Use a blog to move class discussion online and to let those who prefer it post or ask questions anonymously, with less risk. Set open commenting but review comments before allowing public sharing.
5. Collect data in groups, then publish datasets on figshare with meta-data and appropriate a priori defined tags. Teaching assistants mark datasets weekly on figshare using a standard rubric.
6. Visit a data repository such as KNB and download a dataset. Read the meta-data and do some basic statistics with the data. Surprisingly hard to do and an awesome lesson in discovering the importance of meta-data.
7. Identify an important ecological/enviro issue online, not from Web of Science, and link to human well-being.
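
For item 6 above, here is a minimal sketch of what "read the meta-data and do some basic statistics" might look like in Python. Everything in it is a hypothetical stand-in: the inline CSV, the `shrub_cover` column, and the `METADATA` dictionary are invented for illustration and do not reflect the schema of KNB or any other repository. The point is only that the summary depends on what the meta-data say about missing values.

```python
import csv
import io
import statistics

# Hypothetical stand-in for a file downloaded from a data repository.
RAW_CSV = """site,shrub_cover
A,12.5
B,-999
C,20.0
D,17.5
"""

# Hypothetical meta-data record; real repositories use their own schema.
METADATA = {
    "shrub_cover": {"unit": "percent", "missing_value_code": "-999"},
}


def read_column(text, column, metadata):
    """Return numeric values of one column, dropping the documented missing code."""
    missing = metadata[column]["missing_value_code"]
    rows = csv.DictReader(io.StringIO(text))
    return [float(row[column]) for row in rows if row[column] != missing]


def summarize(values):
    """Basic descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
    }


cover = read_column(RAW_CSV, "shrub_cover", METADATA)
summary = summarize(cover)
print(summary)
```

Without the meta-data, the -999 entry would be treated as a real measurement and drag the mean far below zero, which is the kind of surprise students tend to run into in this exercise.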

These are the ones I have tried to date, but SlideShare decks, figshare experimental design outlines, and many others are likely viable too. I have also been considering GitHub, YouTube videos by students, and iNaturalist contributions.


Notes on big data, little data convergences from an ecological perspective.

The seminar I have developed, http://bit.ly/bigdata-littledata, feels very Dr. Seuss.


Abstract

Recently, I have been examining Big Data issues for ecology and Little Data issues for athletes. I realized that the challenges & solutions within each domain are very similar. Ecology with big data and experiments on a limited number of athletes with little data overlap significantly. The relative importance of framing the contextual evidence and using appropriate synthesis simplifications suggests that a web-centric, open-science approach to many disciplines of research will promote more effective detection of important factors at both ends of the data spectrum. Connecting ideas is great; connecting data is even better.

Notes to make sense of deck & tweets

concept | tweet
I became an ecologist because I love being outside. | #outside time is #goodtimes
However, I now recognize that pure field ecology without consideration of data, web dissemination, & open science reduces value. | outside only, no inside computer time linking data, writing meta-data, bad.
Convergence is the combination of disparate phenomena. | Convergence is the combination of disparate phenomena. Field ecology & open data must combine.
Three steps to convergence: create, collect, combine. | Scientific convergence steps: create, collect data, combine.
Data establish convergence. | Data establish convergence, promote novel connections, & reciprocally accelerate quantification.
Connecting ideas is great, connecting data is better. | Connecting #ideas is great, connecting #data is better.
Adventure alignment between collecting & connecting data should be neutral. | Adventure alignment between collecting & connecting data should be neutral.
Research scientist with experience with data. Collecting it, sharing it, using it, losing it, needing it, & failing to untangle it. I am an ecologist. Ecology is always about interactions. Biotic-biotic-abiotic and the complexity of those networks. | Ecology is about interactions & important for #BigData to also untangle/identify meaning
I work in deserts exploring the importance of just a single shrub species that facilitates or helps other species. We build datasets for all the different players/participants/interactors in an effort to understand the importance of interactions in maintaining resilience structure. | In ecology, we sometimes need to connect little data to make #BigData
The goal is to build interaction networks, not just foodwebs, and include horizontal interactions to map the complexity of these systems. | Building networks is a viable solution to #BigData complexity.
Ecology can help us understand and manage big data. It is not a big stretch from ecological networks to big data as balls of yarn that you would love to knit together into something useful. | Ecology about connecting the dots. Both untangle #BigData & knit together patterns
Big Data are not static, nor isolated to interactions with machines. We embody big data. The web is big data. Big Data want you and already have you. | Interacting with #BigData generates more data. We now embody data.
V is for Vampire and Big Data are all about V (and vampires). | V is for vampire & #BigData. Volume, Variety, Velocity. Take control of relationship with data (ownership, privacy, download)
Example: Walmart uses a ‘social genome’ approach combining public data from the web, social data, and proprietary data such as customer purchasing data and contact information. | Walmart uses a social genome to knit together #BigData for product placement, stocking, and consumer context.
Example: Google flu compares query counts with traditional flu surveillance systems. Interacting with Big Data generates Big Data. | Your search for information is information related to you, your ecology, and your ecosystem. #BigData reciprocity
Example: remote sensing provides rich datasets but scale is a challenge. A recent innovation is exploring the mechanistic links between climate and the environmental sensitivities of organisms, which occur through the microclimatic conditions that organisms experience. | Remote-sensing #BigData provide regional, landscape, and sometimes local context for dynamics
Example: The abundance & distribution of birds, butterflies, mammals, and many other organisms are recorded and mapped by citizen scientists. | Global #BigData of abundance & distribution are rapidly growing for many organisms #citizenscience
C is for challenge. We accept the challenge. It is an adventure we cannot avoid. C is for capture, curation, context (meaning) & complexity-analytics (both for many smaller datasets aggregated or singular larger ones). The adventure is to solve these challenges at multiple scales and for multiple functions from individual to industry to countries to global challenges. | C is for challenges in #BigData: capture, curation, context & complexity-analytics. It is an unavoidable adventure we must accept.
For me data are evidence. Material and immaterial. Data at many volumes can illuminate context, connections, or interactions and I see solutions that help me capitalize on opportunity and own the data. | #BigData are evidence. Use it to illuminate context, connections, and most importantly interactions in my research and in my life.
CONTEXT: It is informative to a limited extent to see where you are in a distribution, landscape, or constellation of points. | Context solution: even a single data point in #BigData can be informative.
INTERACTIONS: focus on interactions. Archive & aggregate your datasets. To archive, share but set appropriate permissions & privacy. | Interactions solution: focus on schema & aggregation of #BigData
SYNTHESIS: Find and use metrics, indices, or effect size metrics that simplify your big data and allow it to connect to other evidence. | Synthesis solution: use metrics that estimate/summarize relative change to connect #BigData
The opportunity for context, interactions, & synthesis is only accelerating with 1-3 billion online, smartphones, and the capacity for threaded Big Data. However, you have to own it. | Correlation almost always implies correlation, use that to your advantage in #BigData
Correlation almost always implies causation. Use that to your advantage to seek explanations, context, and the real factors that influence the outcome of interest. | Smartphones change everything for #BigData with 3 billion online – own your interactions
I challenge you to spend only 1 minute on www.worldmeters.info and not be inspired to seek synthesis. Two profound examples but there are many more using evidence to make the best possible decisions. Data are not everything but can complement positive values & logic. | 1 min on www.worldmeters.info gives you a feel for #BigData volume. @nceas & #Cochrane are inspirational solutions
Ecological reasoning predicated upon interactions PLUS big data is a big adventure. Context, interactions, and synthesis are three simple steps or tools that we need to solve not just personal but global challenges to more effectively & healthily live on this planet. | #ecology + #BigData = CIS-tem (context, interactions, synthesis) needed to face global challenges & live better. Use evidence to decide.
Metascience & scientometrics are two important research domains evolving in ecology & other disciplines.
Explore the capacity for research products, primarily peer-reviewed publications, to connect to one another.
Structural equation models & response surface methodologies are becoming increasingly common in ecology.
The internet of things and micro-instrumentation with loggers is transforming mechanistic ecological research.
The inherent value of data as an independent, valid research output is increasing.
DataCite is working to promote effective metadata schema & standards for publishing datasets. | DataCite is working to promote effective metadata schema & standards for publishing datasets.
Novel evidence datastreams that align with #BigData on the web are common in the natural sciences now too. | Novel evidence datastreams that align with #BigData on the web are common in the natural sciences now too.
Sharing code is also an important advance in effective data discoveries in ecology and many disciplines. | Sharing code is also an important advance in effective data discoveries in ecology and many disciplines.
Big Data, Little Data challenges & solutions converge. | Big Data, Little Data challenges & solutions converge.
It is unlikely that too much running kills but does illustrate the importance of context in datasets. | It is unlikely that too much running kills but does illustrate the importance of context in datasets.
Little datasets are not necessarily simple. Can be deep but not wide and a challenge to handle. | Little datasets are not necessarily simple.
Little data challenges include contrast, representativeness, & power. | Little data challenges include contrast, representativeness, & power.
Solutions include pre-post contrasts, effect sizes, contrasts to data landscape data. | Solutions include pre-post contrasts, effect sizes, contrasts to data landscape data.
Representativeness can be explored by changing scales or sensitivity analyses. | Representativeness can be explored by changing scales or sensitivity analyses.
Use power to design pilot experiments and explore realistic expectations. | Use power to design pilot experiments and explore realistic expectations.
Big Data, Little Data issues thus converge through framing & synthesis simplifications. | Big Data, Little Data issues thus converge through framing & synthesis simplifications.
Shut down computer, go outside BUT capitalize on data convergences to maximize collection to connection value. | Shut down computer, go outside BUT capitalize on data convergences to maximize collection to connection value.
Web-centric ecology should embrace open-science research objects in all forms. | Web-centric ecology should embrace open-science research objects in all forms.
Meta-data, interactions, and novel data streams in ecology are an emerging opportunity. | Meta-data, interactions, and novel data streams in ecology are an emerging opportunity.
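
The notes above recommend effect sizes and metrics that summarize relative change as a way to connect datasets. As a rough sketch, two commonly used choices are the log response ratio and Cohen's d. They are shown here as generic examples, not as the specific metrics used in the seminar, and the plant counts are hypothetical numbers made up for illustration.

```python
import math
import statistics


def log_response_ratio(treatment, control):
    """ln(mean treatment / mean control); positive values mean an increase."""
    return math.log(statistics.mean(treatment) / statistics.mean(control))


def cohens_d(treatment, control):
    """Standardized mean difference using the pooled standard deviation."""
    n_t, n_c = len(treatment), len(control)
    var_t = statistics.variance(treatment)
    var_c = statistics.variance(control)
    pooled_sd = math.sqrt(
        ((n_t - 1) * var_t + (n_c - 1) * var_c) / (n_t + n_c - 2)
    )
    return (statistics.mean(treatment) - statistics.mean(control)) / pooled_sd


# Hypothetical little dataset: plant counts under shrubs versus in the open.
under_shrub = [12, 15, 14, 16, 13]
open_ground = [6, 8, 7, 9, 5]

lnrr = log_response_ratio(under_shrub, open_ground)
d = cohens_d(under_shrub, open_ground)
print(f"log response ratio: {lnrr:.3f}")
print(f"Cohen's d: {d:.3f}")
```

Because both metrics are unitless, a value from a five-plot desert survey and a value from a much larger dataset can sit on the same scale, which is one way little data and big data can be combined.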

AI & you: know thyself


AI can help you empathize through an avatar, describe art, generate art, write news reports for sports, trade stocks, fly airplanes, drive cars, beat you in games, diagnose cancer, and identify patterns in messy big data.  What is next? Can it describe you? Not just what you do but what you need and value.

No surprise, yes. The personality insight service of Watson was nearly perfect (i.e. accurate in describing me, sadly, based on feedback from other humans) and highly precise (i.e. replicable) in my repeat trials of the free service (five trials of casual text I wrote, at approximately 1000 words per instance).


I am grappling with the implications. Should disciplines of scientists do it to explore consilience and common trends? Should my students do it so that I can tailor teaching more directly to their needs or sets of needs? Are those that collaborate more in ecology more likely to be open? Likely yes on all of these, so should we start using these tools in science to build teams and working groups? Maybe…

Prescriptive, not predictive, is likely the more parsimonious approach to using AI for personality analyses.
Know thyself through these tools to identify your limitations and your opportunities for change and focus, and to remind yourself that you are neither static nor defined simply by others. You are a complex ecosystem, like all natural systems, with multiple dimensions.
