mfioretti: digitization*

34 bookmark(s)

  1. Organizations across countries use machine-readable data for a number of applications, as:

    A resource in the development of web and mobile products and services. Organizations create digital applications that present data in accessible ways. For instance, one agribusiness company in Ghana automates the translation of weather data and commodity prices into simple phrases that are texted to farmers in their local languages. Many organizations conduct predictive analytics and forecasting. For example, one Indian geospatial analytics company uses machine-readable geospatial and agricultural data to predict crop acreage and yields.
    A way to optimize organizational decision-making. Several organizations use machine-readable open data to inform their strategy and investments. Census, household and income surveys in particular are critical to many for targeting populations and markets. Such data is especially useful when disaggregated by sex, age, location and household income.
    Evidence for research and policy recommendations. Research institutions from Moldova to Zambia use machine-readable data as critical evidence to conduct analyses and support policy recommendations on issues ranging from regional and national economic development, poverty and economic integration, to health and democracy initiatives.
    A tool for advocacy on government spending, elections, and programs. For example, one nonprofit in Ukraine uses spending data to monitor government finances and programs. Another in Nigeria uses budget data to develop infographics for citizens. Yet another provides a tool to monitor contracts, including for the extractive industries in various countries. Across regions, organizations are training journalists to use government data in their reporting, and monitor elections using open electoral commission data.

    Most of the data used is not (yet) machine-readable.

    While all the organizations in our study used machine-readable data in their work, half of them told us that the majority of the data they need is still only available in PDFs, images, paper reports, or as website text. Over three quarters of the organizations stated that formats were a barrier to data use. This is especially the case when working with large, historic and geospatial datasets. For example, organizations benefit most from geospatial data when it is highly detailed and available in shapefiles, GeoJSON, or CSV - formats that can be utilized by a computer - rather than in image form, as it is too often provided. Similarly, census data is especially valuable when it can be accessed in bulk and is available in CSV or other machine-readable formats.
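As a rough illustration of what "formats that can be utilized by a computer" means in practice, the sketch below reads a small CSV table and a GeoJSON document using only the Python standard library. The sample data, column names and the `population_by` helper are invented for this example and do not come from the study.

```python
import csv
import io
import json

# Hypothetical sample data, invented for illustration only.
CENSUS_CSV = """district,sex,age_group,population
North,F,0-14,1200
North,M,0-14,1250
South,F,0-14,900
South,M,0-14,950
"""

DISTRICT_GEOJSON = """{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "properties": {"name": "North"},
     "geometry": {"type": "Point", "coordinates": [-0.2, 5.6]}}
  ]
}"""


def population_by(rows, key):
    """Sum the 'population' column, grouped by the given column name."""
    totals = {}
    for row in rows:
        totals[row[key]] = totals.get(row[key], 0) + int(row["population"])
    return totals


# CSV rows become dictionaries keyed by the header line.
rows = list(csv.DictReader(io.StringIO(CENSUS_CSV)))
by_district = population_by(rows, "district")
by_sex = population_by(rows, "sex")

# GeoJSON is ordinary JSON, so the features can be walked directly.
features = json.loads(DISTRICT_GEOJSON)["features"]
place_names = [feature["properties"]["name"] for feature in features]

print(by_district)
print(by_sex)
print(place_names)
```

The same figures locked inside a scanned PDF or an image would first have to be retyped or extracted by OCR before any of these steps could run.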
  2. Tapes (VHS, Hi8, Etc.)
    Analog-to-digital video converters are the most common tools for digitizing your old tapes. In fact, your DV camera may have conversion capabilities. If you can input composite audio and video or S-Video to your DV camera, you can use it to digitize just about any analog source. All you'll need to get it onto your computer is software that can handle a DV stream. Pretty much any video editing software made in the last decade can capture DV, so you won't be hard-pressed to find something. Apple's iMovie and Final Cut Express/Pro (Mac), Windows Movie Maker (Windows), and Kino (Linux) are just a few examples. If you don't have a DV camera, you can also use TV tuner cards with composite input or DV bridges made specifically for the purpose of converting analog video. For more information, covers the analog-to-digital conversion process in greater detail.

    While newer video recording formats tend to avoid or minimize the pitfalls of quality degradation through use, VHS tapes do not provide that luxury. You may find that when digitizing especially worn-out VHS tapes, the digital signal will cut out due to something as minor as a little jitter. This is due to a break in the timecode on the VHS tape. The simplest way around this issue is to obtain a high-quality, professional VHS deck with a time-base corrector. The time-base corrector will generate the timecode instead of the actual tape and this will prevent the jitter. While these decks were once fairly expensive, you can now find them used online for a fairly reasonable price.

    Most analog video formats can be digitized by utilizing a converter, but in the case of Hi8 tapes you have another option. Sony created Digital8 camcorders that have the ability to digitize Hi8 tapes in-camera and output a DV signal.

    Though the suggestion so far has been to save video in the DV codec, with enough video you'll need a significant amount of disk space to store it. If you're comfortable with more aggressive compression, encoding your newly digitized videos in MPEG-4 or H.264 will help save a significant amount of space. Encoding at a video data rate of around 2 Mbps and an audio data rate of 192 kbps should provide you with a smaller file and a negligible loss of quality.
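To get a feel for the space saved, here is a back-of-the-envelope sketch of the size of one hour of video at the suggested rates. The DV figure of about 3.6 MB per second (video, audio and overhead) is a commonly quoted approximation, not a number from the text above, and the function name is made up for this example.

```python
def megabytes_per_hour(video_kbps, audio_kbps):
    """Size of one hour of footage at a constant total bitrate, in MB (10^6 bytes)."""
    total_bits = (video_kbps + audio_kbps) * 1000 * 3600
    return total_bits / 8 / 1_000_000


# Rates suggested in the text: about 2 Mbps video plus 192 kbps audio.
compressed_mb = megabytes_per_hour(2000, 192)

# Assumption (not from the text): a DV stream takes roughly 3.6 MB per second.
DV_MB_PER_SECOND = 3.6
dv_mb = DV_MB_PER_SECOND * 3600

print(f"H.264/MPEG-4 at 2 Mbps + 192 kbps: about {compressed_mb:.0f} MB per hour")
print(f"DV (approximate):                  about {dv_mb:.0f} MB per hour")
print(f"DV is roughly {dv_mb / compressed_mb:.0f} times larger")
```

Under these assumptions an hour of tape shrinks from around 13 GB to just under 1 GB.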
    by M. Fioretti (2016-03-17)
  3. Getting the codec parameters right

    VHS is analogue and exhibits quite a lot of noise – and noise is not handled well by modern compressors/codecs like xvid, h264 etc. Of course, you can store the v4l output uncompressed, raw, but prepare yourself for roughly 60-100 GB of data for an hour of video. Thus we need to figure out a way to compress the digitised video while keeping the original quality as good as possible. In real-time, capturing TV signals is quite a challenge in terms of CPU stress, so all capturing/codec settings are always a trade-off between compression quality and CPU speed.
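The "60-100 GB per hour" estimate can be sanity-checked with a little arithmetic. The sketch below assumes a PAL frame of 720x576 at 25 fps stored at 2 bytes per pixel (a packed 4:2:2 format such as YUY2, which is typical for such capture devices but not stated in the text) and 16-bit audio at 48000 Hz in stereo (the channel count is also an assumption).

```python
WIDTH, HEIGHT, FPS = 720, 576, 25   # PAL capture geometry
BYTES_PER_PIXEL = 2                 # assumption: packed 4:2:2 (YUY2-style)
AUDIO_RATE_HZ = 48000
AUDIO_BYTES_PER_SAMPLE = 2          # 16-bit samples
AUDIO_CHANNELS = 2                  # assumption: stereo

SECONDS_PER_HOUR = 3600

video_bytes_per_hour = WIDTH * HEIGHT * BYTES_PER_PIXEL * FPS * SECONDS_PER_HOUR
audio_bytes_per_hour = (
    AUDIO_RATE_HZ * AUDIO_BYTES_PER_SAMPLE * AUDIO_CHANNELS * SECONDS_PER_HOUR
)
total_gb_per_hour = (video_bytes_per_hour + audio_bytes_per_hour) / 1e9

print(f"raw video: {video_bytes_per_hour / 1e9:.1f} GB per hour")
print(f"raw audio: {audio_bytes_per_hour / 1e9:.2f} GB per hour")
print(f"total:     {total_gb_per_hour:.1f} GB per hour")
```

With these assumptions the total lands at roughly 75 GB per hour, inside the quoted range; a different pixel format would shift the figure up or down.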

    If you have a big hard drive or want to capture only a short segment of video, for best results and fewer problems on older systems, I would suggest a two-step setup. First, capture uncompressed to a temp file, then run mencoder with a compression/codec combo that your machine normally couldn’t handle in real-time. CLI commands for that, to get you started:

    mencoder tv:// -tv channel=0:driver=v4l2:device=/dev/video0:normid=5:input=0:width=720:height=576:norm=PAL:fps=25:alsa:adevice=hw.1:forceaudio:brightness=0:contrast=0:hue=0:saturation=0:buffersize=128 -oac pcm -ovc copy -endpos 00:15:00 -o VHS1raw.avi

    This is for the “first pass”, a raw copy of the audio and video input. Note the “hw.1” part: the number is the card index you get from cat /proc/asound/cards for the stk1160’s audio device. This may change between boots/USB connects, depending on the other soundcards in your system. Next is the buffersize. Normally mencoder should adjust the buffer automatically, but giving it a forced value here seems to do no harm. Instead of using -oac copy I use -oac pcm, which is more or less the same, I think, but I once got a strange error with “copy” about frame sizes and never saw it again with “pcm”, which I think muxes better than the 16-bit little-endian stuff the stk1160 outputs. -endpos obviously tells mencoder to stop after 15 minutes, as that is what we want to grab here. The command should give you a low one-digit percentage CPU usage value and quite a lot to do for your hard drive.
    After that, it’s time for a “second pass”, although this is not a second pass in the sense the term usually has in video compression; it is rather a second step, the real “transcode pass”:
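Since the card number can change between boots, it may help to look it up rather than hard-code it. The following is a minimal sketch that parses text in the layout of /proc/asound/cards and builds the adevice value. The sample listing, in particular the way the stk1160 names itself, is an assumption made for this example, and `find_card_index` is a hypothetical helper, not part of mencoder or ALSA.

```python
import re

# Matches the first line of each card entry, e.g. " 1 [name   ]: driver - description"
CARD_LINE = re.compile(r"^\s*(\d+)\s+\[([^\]]*)\]:\s*(.*)$")

# Hypothetical listing; on a real system read open("/proc/asound/cards").read().
SAMPLE_CARDS = """\
 0 [PCH            ]: HDA-Intel - HDA Intel PCH
                      HDA Intel PCH at 0xf7f10000 irq 33
 1 [stk1160mixer   ]: stk1160 - stk1160-mixer
                      Syntek stk1160 audio
"""


def find_card_index(cards_text, needle):
    """Return the index of the first card whose entry line mentions needle, else None."""
    needle = needle.lower()
    for line in cards_text.splitlines():
        match = CARD_LINE.match(line)
        if match and needle in line.lower():
            return int(match.group(1))
    return None


index = find_card_index(SAMPLE_CARDS, "stk1160")
adevice = f"hw.{index}" if index is not None else None
print(adevice)
```

The resulting string would take the place of hw.1 in the adevice= option of the capture command above.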

    mencoder -ovc xvid -xvidencopts fixed_quant=4 -oac mp3lame -lameopts cbr:br=128 -ofps 25 -o VHS1.avi VHS1raw.avi

    So this second command is the real one. We encode to XVid here, with standard settings, with a “quality target” of 4 – which means a variable bitrate to reach a certain quality. Visual comparisons, for me, came out with 4 being a very good setting, with bitrates around 2500-3500 kbit/s for video. Later on, we’ll see that 5 is a bit faster to compress and still ok. For the audio part the command uses mp3 with a constant bitrate of 128 kbit/s. We could add a bit of downsampling from 48000 Hz in the original raw audio to 44.1 kHz to shave off some more bits, but a variable bitrate might be more useful to achieve that.
    On a low-range Intel Core2Duo, I get transcoding framerates between 12-14 fps, so transcoding these 15 minutes would take half an hour.
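The "half an hour" estimate follows directly from the frame counts; a short sketch of the arithmetic (the function name is made up for this example):

```python
def transcode_minutes(clip_minutes, source_fps, transcode_fps):
    """Minutes needed to transcode a clip at a given encoder throughput."""
    total_frames = clip_minutes * 60 * source_fps
    return total_frames / transcode_fps / 60


# 15 minutes of 25 fps PAL video, encoded at 12 to 14 frames per second.
slowest = transcode_minutes(15, 25, 12)
fastest = transcode_minutes(15, 25, 14)

print(f"between {fastest:.1f} and {slowest:.1f} minutes")
```

In other words, the encoder runs at roughly half real-time speed, so every hour of tape costs about two hours of transcoding on such a machine.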

    Note that the grabbed video will have square pixels while the original TV/VHS video has non-square pixels. So adjust your player to present the video with an aspect ratio of 4:3. Otherwise the video played back on a desktop computer will be slightly compressed horizontally. Sadly, embedding this info into the file so that players adjust this automatically doesn’t work reliably.
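The size of the correction can be worked out from the frame geometry. The sketch below treats the full 720-pixel width as the 4:3 picture, which is a simplification of how broadcast standards define the active picture area, so the numbers are approximate.

```python
from fractions import Fraction

STORED_WIDTH, STORED_HEIGHT = 720, 576
DISPLAY_ASPECT = Fraction(4, 3)

# Shape of the stored frame if every pixel is drawn square.
storage_aspect = Fraction(STORED_WIDTH, STORED_HEIGHT)

# How much wider than tall each pixel must be drawn to fill a 4:3 picture.
pixel_aspect = DISPLAY_ASPECT / storage_aspect

# Width, in square pixels, of a correctly proportioned 576-line picture.
display_width = int(STORED_HEIGHT * DISPLAY_ASPECT)

print(f"storage aspect: {storage_aspect} ({float(storage_aspect):.3f})")
print(f"pixel aspect:   {pixel_aspect} ({float(pixel_aspect):.3f})")
print(f"display size:   {display_width}x{STORED_HEIGHT}")
```

Played back with square pixels, the picture is 720 instead of 768 pixels wide, about 6% too narrow, which is the slight horizontal compression mentioned above.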
  4. Fifteen European countries have been losing momentum since 2008 in terms of their state of digital evolution – this is what we mean by a digital recession – with the Netherlands coming in dead last in our momentum rankings. European countries occupy the nine bottom spots in our list of 50. Plus, the digitally receding countries include large economies like Germany, the UK, and France, as well as Finland and Sweden, Scandinavian tech powerhouses that were the early leaders of mobile telephony. Across the rest of Europe, the state of digital evolution has been mediocre and the pace of improvement, tepid.

    This dismal performance points to a glaring – and growing – digital gap as Europeans watch the U.S. and China take the lead in tech innovation. President Obama said it plainly in a recent interview: “We have owned the Internet. Our companies have created it, expanded it, perfected it in ways that they can’t compete,” referring to the Europeans. And a recently released report suggests that Europe’s digital divide problem extends way beyond the Atlantic; Europe is a distant third behind North America and Asia for $100 million plus financing for VC backed companies.
  5. I would not use public or third-party cloud as the primary backup of my data for various reasons. First of all, I have over 3 terabytes (TB) of data and it would be extremely expensive for me to buy 3 TB of cloud storage. I would have to pay over $120 per year for 1TB of data or $100 per month for 10TB on Google Drive. Cost is not the only deterrent; I will also consume huge amounts of bandwidth to access that data which may raise eyebrows from my ISP.

    The biggest danger is that once you stop paying, you lose your data. That’s not the only problem with public cloud; the moment you start using such services, your data becomes subject to numerous laws and can be accessed by government agencies without your knowledge. Your service provider gains control over your data and can lock you out of your own data for numerous reasons - most notably some ambiguous copyright violations.

    Private cloud like ownCloud or Seafile can be an option but once again, since your data has left your network it is exposed to the rest of the world and, as usual, it will incur heavy bandwidth use and storage costs.

    I do use private cloud but that's mostly for the data that I want accessible outside the local network or which is shared with others. I never use it as backup.
  6. Why are libraries still vital? Among other things, in Palfrey’s view, they provide access to the great equalizer of high-speed broadband, which not all communities have and which is a crucial need for new immigrants and low-income families, especially those with kids, who need online access for homework.

    “A huge amount of the foot traffic is young people,” says Palfrey, who sends his history students to the library for projects. “They get assigned to go there. They’re consistently among the biggest library users.”

    Libraries also archive historical material, a task made easier and more user-friendly in the digital world. Summers says the Miami-Dade library has all of the newspapers ever published in the city on ancient microfiche, information that would be easy and relatively inexpensive to store digitally — but the library doesn’t have the money to pay for the technology or the bodies.

    “The biggest challenge at this moment is under-investment,” she says. “You’ve got to have the bodies. We had 600 librarians. Now we’re down to less than 400.”

    The Broward County Libraries Division has made some inroads in innovation with partnerships with Nova Southeastern University and local businesses, as well as through its Creation Station, a hands-on lab for learners of all ages that will expand throughout the county.

    “It really is a balancing act, to play into the history of libraries and how people viewed them and maintaining our regular book collection while also learning how to innovate and stay aware of technological advances,” says director Skye Patrick. “We’re a publicly funded agency, we don’t have endless amounts of dollars. ... But our original focus hasn’t changed. We provide free access to information as we always have. What’s changed is how the information is disseminated.”

    Palfrey also believes libraries need to exist as physical spaces
  7. There are many ways to convert a video file on a Linux system, but using a tool with a graphical user interface is imperative for those who want to do it easily and in a more user friendly way. Thankfully, there are many open source GUI tools that could do the job just fine and you can find some specialization here and there if you look closely.

    My choices for this post are Curlew and Handbrake, two easy-to-use video converters that can do a lot more than just that, and at the same time two different approaches aimed at different tastes and needs.
  8. I have dozens of VHS tapes recorded in some cases nearly 30 years ago. I wanted to digitize some of the content and share it with my children via YouTube. I have a digital video camera that's almost ten years old. I also have a VHS/DVD player recorder that I originally purchased to dub these videos. Rather than use the recording deck to create a DVD that I would then use Handbrake to turn into MP4s, I decided to try to use my digital video camera as a passthrough device connected to my laptop.
  9. Today, you can still go visit the Easter Island statues, reveling in the effort it must have taken to construct and move them to their current positions. Other ruins, though, you may never have the chance to see with your own eyes. In recent months, ISIS has essentially bulldozed the ancient Assyrian city of Nimrud, destroying artifacts that date back to the 13th century BC. And they’re currently knocking on the door of the 2,000-year-old Syrian city of Palmyra.

    In a world that consistently threatens humans’ valuable archaeological heritage, new digital preservation technologies can offer a small (very, very small) comfort. CyArk, a nonprofit organization in California, has digitally captured many sites and objects, including those ancient moai on Rapa Nui, through LiDAR (light detection and ranging) data. The three-dimensional scanned images are helping the island’s modern inhabitants record their cultural history for posterity, and aiding researchers in their study of the past.
  10. By defining structured information about the world, gazetteers have the power to shape and structure how geographic meaning is made. There are hundreds of millions of requests for geographic information from GeoNames each month, such as the New York Times using the gazetteer to link articles to places. This means that the biases in gazetteers influence how we are able to understand all sorts of other data that we use in everyday life.

    Gazetteers are gatekeepers to knowledge of place. By not appearing in gazetteers, places are unlikely to ever become present and visible in other geocoded datasets. And because so much additional research, analysis, and visualisation relies on using large gazetteers like GeoNames, the biases that we see here are only likely to be propagated throughout our digital ecosystem.

    This research shows that we need to question the very ground-truths that we’re using to create and understand geographic data and services: because geographic data has its own uneven geographies.

Online Bookmarks of M. Fioretti: Tags: digitization