Luis J. Villanueva-Rivera, from right, Bryan Pijanowski and Sarah Dumyahn collect data from a remote listening post that records sounds from the surrounding area. (Purdue Agricultural Communication photo/Tom Campbell)
WEST LAFAYETTE, Ind. – A Purdue University researcher is leading an effort to create a new scientific field that will use sound as a way to understand the ecological characteristics of a landscape and to reconnect people with the importance of natural sounds.
Soundscape ecology, as it’s being called, will focus on what sounds can tell people about an area. Bryan Pijanowski, an associate professor of forestry and natural resources and lead author of a paper outlining the field in the journal BioScience, said natural sound could be used like a canary in a coal mine. Sound could be a critical first indicator of environmental changes.
Pijanowski said sound could be used to detect early changes in climate, weather patterns, the presence of pollution or other alterations to a landscape.
“The dawn and dusk choruses of birds are very characteristic of a location. If the intensity or patterns of these choruses change, there is likely something causing that change,” Pijanowski said. “Ecologists have ignored how sound that emanates from an area can help determine what’s happening to the ecosystem.”
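One simple way to put a number on the chorus intensity Pijanowski mentions is a root-mean-square amplitude index computed over recording windows. This is only a sketch of the idea, not a method from the paper; the sample values are invented for illustration:

```python
import math

def rms_amplitude(samples):
    """Root-mean-square amplitude of one window of audio samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def chorus_change(baseline_windows, current_windows):
    """Relative change in mean RMS between two sets of chorus recordings."""
    baseline = sum(rms_amplitude(w) for w in baseline_windows) / len(baseline_windows)
    current = sum(rms_amplitude(w) for w in current_windows) / len(current_windows)
    return (current - baseline) / baseline

# Invented sample values: a loud baseline chorus and a quieter later one.
loud = [[0.5, -0.5, 0.4, -0.4]] * 3
quiet = [[0.1, -0.1, 0.1, -0.1]] * 3
print(round(chorus_change(loud, quiet), 2))  # negative: the chorus got quieter
```

A sustained drop in an index like this, from recordings made at the same station every dawn, would be exactly the kind of early warning signal the article describes.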
The 11 February issue of Science was a special issue containing a section on data. It looks like an interesting overview, with perspectives from climate research, ecology, and other areas. The most troubling figure, from the introduction, was:
Just one in five researchers has funding for data curation. This is a recipe for wasted effort, increased redundancy, and lost research opportunities.
The online issue’s table of contents is here.
Amazon’s EC2 service offers an interesting variety of cloud computing options. For less than a dollar an hour, you can run a powerful virtual machine on their hardware. See my previous post on setting up an Ubuntu machine with R.
I have been interested in experimenting with this infrastructure, this time by installing a graphical interface, GNOME, on an Ubuntu machine. After setting up an Ubuntu 10.10 machine from these instructions, set up remote access with these instructions:
Amazon EC2 is their “cloud” service, which means that you can run a virtual machine on their hardware. They offer many basic VM images, which they call AMIs, that you can use to start and set up a machine with the configuration and software that you need.
Since October 2010, new Amazon Web Services accounts get a year of free service on the basic tiers of their offerings. For example, they offer 5 GB of S3 storage and a “micro” virtual machine on the EC2 platform. The micro has a single Xeon E5430 2.66 GHz CPU, 613 MB of RAM, and 8 GB of disk space. It is not much, but you can use it to play around and learn to use EC2. You can also set up a machine configuration and then save it as an image (AMI) to create more powerful machines from that configuration.
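For reference, here is a back-of-the-envelope calculation of what a micro instance would cost once the free year runs out. The $0.02 per hour rate is my assumption based on the published on-demand price, so check the current EC2 price list before relying on it:

```python
# Hypothetical rate; check Amazon's current price list before relying on it.
MICRO_RATE = 0.02       # assumed US dollars per hour for a micro instance
HOURS_PER_MONTH = 730   # average hours in a month

def monthly_cost(rate_per_hour, hours=HOURS_PER_MONTH):
    """Cost of keeping one on-demand instance running for the given hours."""
    return rate_per_hour * hours

print(round(monthly_cost(MICRO_RATE), 2))  # around $14.60 a month, running nonstop
```

The same arithmetic scales to the larger instance types: at well under a dollar an hour, even a powerful machine used for a few hours of analysis costs pocket change.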
I just found this column in Nature discussing the need for scientists to publish the code they use. I am still amazed that this is not getting more attention. If a properly written Methods section is mandatory in a paper, why not the code that produced the results?
Among the excuses for not publishing code (and the reasons why they are usually not valid) that the column identifies are:
- It is not common practice.
- People will pick holes and demand support and bug fixes.
- The code is valuable intellectual property that belongs to my institution.
- It is too much work to polish the code.
Even when the Methods section may be enough to reproduce your results, why condemn someone else to go through the whole process of writing and debugging code to make it work? The code may not be pretty, may require some obscure software, or may be more convoluted than it has to be, but it is incredibly valuable and a time-saver for other researchers. This kind of code is also very useful for students: it lets them learn how research is done in their area.
Of course, this requires some planning and careful file management. Hopefully, universities and scientific societies will start promoting code publishing to make this practice more common.
Barnes, Nick. 2010. Publish your computer code: it is good enough. Nature 467: 753. doi:10.1038/467753a
I have just posted online the data used for the paper: Acevedo, M. A. and L. J. Villanueva-Rivera. 2006. Using automated digital recording systems as effective tools for the monitoring of birds and amphibians. Wildlife Society Bulletin 34:211-214.
This data has been released by the authors under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 License and can be used for educational and research purposes. I would appreciate it if you let me know when and how you use this data.
Any commercial use is prohibited without the written authorization of the authors.
On the blog seeing data, Chris McDowall published a Zen of Open Data, based on the Zen of Python (a programming language):
The Zen of Open Data
Open is better than closed.
Transparent is better than opaque.
Simple is better than complex.
Accessible is better than inaccessible.
Sharing is better than hoarding.
Linked is more useful than isolated.
Fine grained is preferable to aggregated.
(Although there are legitimate privacy and security limitations.)
Optimise for machine readability — they can translate for humans.
Barriers prevent worthwhile things from happening.
“Flawed, but out there” is a million times better than “perfect, but unattainable”.
Opening data up to thousands of eyes makes the data better.
Iterate in response to demand.
There is no one true feed for all eternity — people need to maintain this stuff.
During the last two weeks of August I will be visiting Costa Rica for an OTS/PASI course on the use of embedded sensors for tropical ecology. Hopefully this kind of course will push forward the use of many kinds of sensors in tropical ecology. There is so much diversity in tropical forests that, in order to better understand these complex ecosystems, we need to use all the tools that are available.
This Sunday is World Listening Day, a day set up to celebrate the practice of listening, raise awareness about sound-related issues and projects, and design and implement educational initiatives. The date is the birthday of R. Murray Schafer, author of The Soundscape.
Visit their website for more information, and let’s promote listening to our world; there is more out there than car alarms, A/C systems, and noise.
I just found a special report from The Economist on data, “Data, data everywhere.” The report deals, in several articles, with the massive amounts of data available today. They cover mostly the business implications, but also scientific data. For example, the Large Hadron Collider:
[G]enerate 40 terabytes every second—orders of magnitude more than can be stored or analysed. So scientists collect what they can and let the rest dissipate into the ether.
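To put that rate in perspective, a quick calculation of what a single day at 40 terabytes per second would add up to:

```python
TB_PER_SECOND = 40            # rate quoted in the report
SECONDS_PER_DAY = 24 * 60 * 60

tb_per_day = TB_PER_SECOND * SECONDS_PER_DAY
print(tb_per_day)                # terabytes per day
print(tb_per_day / 1000000.0)    # the same figure in exabytes per day
```

That works out to roughly three and a half exabytes a day, which makes it obvious why they have no choice but to discard most of it at the detector.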
Another quote that got my attention was:
Only 5% of the information that is created is “structured”, meaning it comes in a standard format of words or numbers that can be read by computers.
This means that very little of the available data can be easily imported into other computer systems for analysis. It will become very important to make data available in a way that other computers can use it; otherwise, most of the time and cost will go into re-formatting data, much like when data had to be transferred from paper to a computer, all over again. We should make raw data available, but also make it available in structured form. A PDF is great for humans, but it is terrible when you try to extract data from it. Even something as simple as a comma-separated file would help this process a lot.
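To illustrate how little work the structured route takes, here is the full round trip with Python's built-in csv module. The dataset and column names are made up for the example; an in-memory buffer stands in for a file on disk:

```python
import csv
import io

# A small, made-up dataset: species counts from two recording sites.
rows = [
    {"site": "A", "species": "Eleutherodactylus coqui", "count": 12},
    {"site": "B", "species": "Eleutherodactylus coqui", "count": 7},
]

# Write the data as comma-separated values (an in-memory buffer stands in
# for a real file here).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["site", "species", "count"])
writer.writeheader()
writer.writerows(rows)

# ...and any other program can read it back without re-typing anything.
buffer.seek(0)
recovered = list(csv.DictReader(buffer))
print(recovered[0]["species"])
```

Note that DictReader returns every field as a string, so numbers still need one conversion step; that is still far cheaper than scraping the same table out of a PDF.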
Another evident consequence is that scientists, most notably the next generation, will need to know how to work with large amounts of data. Programming and databases will have to become part of a scientist’s education, so you better start sooner rather than later.
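And databases are more approachable than they sound: Python ships with SQLite, so getting started takes only a few lines. The table and column names here are invented for the example:

```python
import sqlite3

# An in-memory database; pass a file name instead to keep the data on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE observations (site TEXT, species TEXT, individuals INTEGER)"
)
conn.executemany(
    "INSERT INTO observations VALUES (?, ?, ?)",
    [("A", "coqui", 12), ("B", "coqui", 7)],
)

# Questions about the data become one-line queries.
total = conn.execute("SELECT SUM(individuals) FROM observations").fetchone()[0]
print(total)  # 19
conn.close()
```

Once the data lives in a database, filtering, summarizing, and joining datasets stop being manual chores and become single queries.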