Design, photography and ramblings

Month: March 2016

RedCap data dictionary upload error

ERROR:
For unknown reasons, the Data Dictionary could not be uploaded successfully to the project. Please try uploading it again, and check with your REDCap administrator if this issue persists.


The yesterday I received this error message while trying to upload a data dictionary file to RedCap.  This error appears up when there are line breaks where they shouldn’t be, special characters (such as ‘EN Dash‘) or characters encoded using something other than UTF-8 encoding, which RedCap requires.

The data dictionary file I was trying to use was a CSV file generated from Excel.  I specifically told Excel to encode the file using UTF-8, but it didn’t listen and did not convert my unicode characters to UTF-8.  My suggestion is to either use Google Spreadsheets or OpenOffice.  I used OpenOffice Calc, and it worked like a charm.  Also Calc lets you specify that all text cells be wrapped in quotes which helps.

If encoding is not your issue

If you know you CSV file is encoded correctly (loading your file into BeyondCompare will tell you), then you need to start looking at the file in a text editor (NOT Excel).  To troubleshoot, I open the CSV file in a text editor (such as NotePad2 or Notepad++) so I can see if any of the lines are breaking or some other weirdness.   I begin testing by cutting the last three quarters of the  lines from the data dictionary file, save, then try to upload to RedCap.  If that works then I add in about 20 more lines, save, then upload again.  I continued this until I find the line that is causing problems.

Recode multiple columns in Excel using VLOOKUP

Download the Excel VLookup multiple column example mentioned below

Say you need to recode several columns of data with the same LOOKUP values.  You can do this by simply copying the table you want recoded (in the example file we are copying the RawData sheet to another sheet we have named ‘KOOS-recode’), and then applying a function shown below to the columns needing to be recoded in the KOOS-recode table.

The recode function we are using is listed below:
=VLOOKUP(INDIRECT("RawData["&INDIRECT(CONCATENATE( SUBSTITUTE(ADDRESS(1,COLUMN(),4),"1",""),1))&"]"), 'KOOS-lookup-recode'!$A$1:$B$5, 2, FALSE)

The logic of this function is simple.  We use two copies of the same table since we can not directly run this function on the table data we want recoded/changed.  So we use the first table as our original ‘raw’ data.  This function gets the name of the column it is being run under (and the cell) and gets the corresponding cell value from the original table, then runs a VLOOKUP on the value to give you the recoded data.

I arrived at this function by first figuring out how to get the currently selected cell names and the currently selected column name.  I figured once I had that I could figure out how to get the corresponding value from the original table in order to transform it.  Somehow I was able to get the syntax correct (which is convoluted to say the least).

Some other Excel functions to get current table and cell information (used in the building of the function above)

Please note, the column/table header names must be the same in both tables

Get the current cell address
=CONCATENATE(SUBSTITUTE(ADDRESS(1,COLUMN(),4),"1",""),ROW())

Get the currently selected column header (value)
=INDIRECT(CONCATENATE(SUBSTITUTE(ADDRESS(1,COLUMN(),4),"1",""),1))

Get values from another table (RawData) to pull into your current table
=INDIRECT("RawData["&INDIRECT(CONCATENATE(SUBSTITUTE(ADDRESS(1,COLUMN(),4),"1",""),1))&"]")

 

 

Free software resources for UNC employees

I thought I would share two software resources available to you that you probably are not aware of and are fully supported by UNC IT helpdesk.  These resources are similar in that they allow you to run software (such as Photoshop or SAS) on your computer without having to install or buy a license.

The first resource is Virtual Lab (https://virtuallab.unc.edu/).  This gives you access to software from companies such as Adobe (Acrobat, Photoshop, Illustrator), SAS (JMP, SAS 9.x, Enterprise Guide), Microsoft (Word, Visio, Excel, Project…) to name a few.  Most popular statistical programs (such as STATA and SPSS) are also available, and you do not need to install these programs to use them.  This resource is nice it you need to use a program such as Photoshop for a one time project and don’t want to bother with installing or buying a license.  Simply log onto the site to view the full listing of applications available to access directly from your computer.  I have used it to convert STATA files to SAS for example since I do not have STATA installed on my machine and do not normally have a need for it.
Running any of the programs from this service requires your computer to do the processing (as any program installed on your computer normally does).  So if you have a large Photoshop file you are editing then your computer will need to have the memory and processing power to handle it.

The second resource is VCL (Virtual Computing Lab) https://vcl.unc.edu/. This resource basically gives you access to another computer to do your processing from your own computer.  This is good for intensive statistical analysis processing or the case I mentioned about large Photoshop file editing.  With this service you make an online reservation (like reserving a computer at the library), then you are given access to connect remotely to the computer using Remote Desktop.  Most of these virtual computers are only loaded with statistical programs such as SAS, but if you wanted to get fancy with it you could use this service to log into a virtual computer and then open a program using the Virtual Lab service within that virtual machine.  You can even go so far as setting up your own custom virtual machine (as I did for a project I am working on here at TARC) and install your own software on it for use by selected members of your research team/group.

Of course if you simply want your own versions of software to install on your machine there is http://software.sites.unc.edu/software/ where you can pick up some versions of software for free, such as SAS (for research use).

Beware the URL

Every one of us has probably gotten that reminder from your organization Help Desk telling you to please avoid clicking on phishing links in your emails.  That is all well and good but URL’s are mysterious strings of characters which were written in an alien language that few understand.  Below are a few things that are important to know if you want to learn some of the mysteries of the URL to help you from becoming a victim of online phishing or malware.

How to quickly spot a phishing email (hover your cursor over ALL links)

Just a quick tip on spotting phishing emails. Before clicking on ANY link in an email (or website), place your cursor over the link and look at the web address for the link that appears in the lower left-hand corner of the window (shown in the image below). For instance if you hover over this link you should see http://unc.edu (or simply unc.edu). Phishing emails are usually reported to campus ITS before they even reach your email box so you can always check to see if a suspicious email has been reported by going to http://its.unc.edu/phish-alerts/ (again hover over this link to see where it is going before clicking on it).

phishing

 

Don’t trust online URL scam checker sites

To begin, there are no good sites out there for checking a URL (web address or website) to determine if it may be a phishing or malware site.   A recent URL sent through email as a phishing attempt is a case in point (http…://connectcarolina.c0.pl/sso.unc.edu/idp/Authn/UserPassword/…).  I changed the URL to keep you from clicking on it by adding … to it, so you are safe.  Anyway, if you pasted the URL (without the …) into a service such as scanurl.net, safeweb.norton.com, or urlvoid.com (to name a few) it would probably say the site was safe.  Don’t believe them.

First of all, none of these ‘verifier’ sites build in enough artificial intelligence to effectively read the website content and code to determine the ‘intent’ of a page.  I tried several of these services and it said the content came from an education website and was minimal risk (only because the content of the page had references to UNC and probably links back to UNC websites).  Minimal risk means nothing here.

Who and what do you trust?

Well for starters, you can not always trust the sender.  Someone sending phishing emails can easily write a program that will send you an email that looks like it is coming from any legit email address (like your mom or your organizational help desk).  It is not difficult at all.

The most useful thing to know is how to read and recognize the domain/website the URL is referring to (Google ‘How to Read a Web Address‘ for additional reading).  In the URL above we are only interested in the domain which is made up of the name  of the domain ‘c0’ and the extension of the domain ‘.pl’.  The ‘connectcarolina’  and ‘sso.unc.edu/idp/Authn/UserPassword’ parts of the URL can simply be thought of as folders within the c0.pl website.  The domain is always the last two strings joined by a period between the http:// part and the first / in the URL.  So in the URL http://pgale.web.unc.edu/homepage, the domain is simply unc.edu (again the last period joining two strings before the first /).

Now that we can recognize the domain name in a URL, things are a bit easier.  An educational domain (like unc.edu) or reputable company domain (like google.com) most likely will impose harsh penalties for posting malicious webpages on their websites or have restrictions to help prevent it, and probably any domain hosted in the USA.  If you Google the domain ‘c0.pl’ you will see that the domain service is for ‘CBA.PL – Najlepszy i najpopularniejszy darmowy hosting’.  I have no idea what ‘Najlepszy i najpopularniejszy darmowy’ means.  If you can’t recognize the domain then that is a red flag to avoid the site.

To complicate things

Some things to watch out for are, first, your email client/program.  If you use OWA (Outlook Web App.) for instance, then it may be using URL rewriting to make all the links in your emails read as https://outlook.unc.edu/owa/redir.aspx?SURL=… (with a lot of jibberish at the end).  This is a problem because it invalidates what I just mentioned above about trusting the domain of the URL.  In this case it looks like the URL domain is unc.edu right? Well it is, however that is simply the starting point.

This brings me redirects.  Most of you may be aware of URL link shortener services.  Here at UNC there is the service http://go.unc.edu/.  So if I wanted to create a URL redirect to my website with a shorter URL it may look like http://go.unc.edu/r3BFg which redirects you to ‘https://pgale.web.unc.edu/’.   Do you see the problem here?  You don’t know what the final destination of the URL is!  At this point an online tool that checks your links to show you the final destination is helpful, such as http://www.internetofficer.com/seo-tool/redirect-check/.  If you copy http://go.unc.edu/r3BFg and paste it into the URL checker it will show you the URL is redirected to https://pgale.web.unc.edu/.  I would trust that site (since it is mine).

Other tools to help if you happen to go to a malware site

So say you learned all of the tricks above but still ended up a malicious website by mistake.  One way to protect yourself even if you do end up on a malicious website (through a website search or email link) is to use a script blocking tool.  Scripts are things on websites for displaying content (videos and images), but also adding bad things to your computer in some cases.  If you use the Firefox browser, a tool of choice for me is NoScript (https://addons.mozilla.org/en-US/firefox/addon/noscript/).  Now I realize most people would lose their patience with this plugin since it forces you to verify domains you trust, but it is worth learning.  Even if I were to browse to a website with malicious software embedded in it, it is highly unlikely I would get infected.  I would be able to leave the site before it caused harm to my computer since I can choose the sites I will accept scripts from (for displaying everything from videos to images).  There are similar tools for other browsers such as Chrome, but I’m a Firefox kind of guy myself.  At any rate, I feel NoScript is one of the most useful tools for helping to prevent you from bad websites online.

I will leave it at that.  There is enough on this page to make your head explode I know, but hopefully you were able to learn something from my experiences.