datascience

Changes in Python objects

I am an enthusiast programmer, not a professional. I want to get better, and sometimes I write notes to improve my English and my programming skills. You will certainly find some mistakes in the text and in the programming concepts in this post. I am sorry, I am trying. Let's get to what matters now! The word “change” is ambiguous in Python: there are two distinct kinds of “change”.
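The excerpt above does not name the two kinds of change. A minimal sketch, assuming they are rebinding a name and mutating an object in place (the variable names are only illustrative):

```python
# Assumption: the two kinds of "change" are rebinding and in-place mutation.

# Rebinding: the name is pointed at a new object; the old object is untouched.
a = [1, 2, 3]
b = a              # a and b refer to the same list
a = a + [4]        # builds a new list and rebinds a; b still sees the old one
rebinding_result = (a, b)

# Mutation: the object itself is modified; every name bound to it sees the change.
c = [1, 2, 3]
d = c              # c and d refer to the same list
c.append(4)        # modifies that list in place
mutation_result = (c, d)
same_object = c is d

print(rebinding_result)              # ([1, 2, 3, 4], [1, 2, 3])
print(mutation_result, same_object)  # ([1, 2, 3, 4], [1, 2, 3, 4]) True
```

Under this reading, `a = a + [4]` changes which object the name refers to, while `c.append(4)` changes the object itself.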

Tips for Python - NumPy and Pandas libraries

Introduction: Saving time on cleaning, tidying, and processing data is quite useful in data science. It means you have more time to analyze and think about solutions. When I do data science in Python, I usually use the Pandas and NumPy libraries. They are great libraries with a lot of smart functions. However, sometimes I forget some of those functions and write my own to do the calculations. That is good practice, but it takes time.

PCA analysis and tidy data

Introduction: PCA (principal component analysis) is a multivariate statistical method concerned with examining several variables simultaneously. The idea is to reduce the dimensionality of a data set by finding the most representative variables to explain some phenomenon. Thus PCA analyzes the interrelations among variables and explains them through their inherent dimensions (components). If there is NO correlation among the variables, PCA will not be useful. This post is divided into two sections.
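The remark about correlation can be checked by hand in the simplest case. A minimal sketch, assuming only two standardized variables: their correlation matrix is [[1, r], [r, 1]], whose eigenvalues are 1 + |r| and 1 - |r|, so the first component captures (1 + |r|) / 2 of the total variance. The function names and data below are made up for illustration.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = mean(x), mean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def first_component_share(x, y):
    """Share of total variance captured by the first principal component
    of two standardized variables (two-variable case only)."""
    return (1 + abs(pearson(x, y))) / 2


x = [1, 2, 3, 4, 5]
correlated = [2, 4, 6, 8, 10]    # perfectly correlated with x
uncorrelated = [2, 1, 0, 1, 2]   # symmetric around the middle, so r is 0

print(first_component_share(x, correlated))
print(first_component_share(x, uncorrelated))
```

With perfect correlation one component carries essentially all the variance. With zero correlation each component carries about half, so dropping one loses half of the information and nothing is gained.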

Filtering objects from a workdir

Introduction: If you work in the data science field, you usually need to create several objects, such as lists, DataFrames, matrices, etc. You also have to handle files to read and copy. For these and other tasks, you sometimes have to select some objects from your workdir. The goal of this post is to show some options for filtering objects from your workdir. Listing and filtering your objects: There are several options for listing the objects in a workdir.
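The excerpt does not show which options the post uses. As a rough sketch using only the standard library, and assuming "objects" covers both files in the working directory and in-memory variables, filtering could look like this (the helper names `files_matching` and `names_of_type`, and all file and variable names, are hypothetical):

```python
import fnmatch
import tempfile
from pathlib import Path


def files_matching(directory, pattern):
    """Sorted names of the entries in `directory` matching a shell-style pattern."""
    return sorted(
        entry.name
        for entry in Path(directory).iterdir()
        if fnmatch.fnmatch(entry.name, pattern)
    )


def names_of_type(namespace, kind):
    """Sorted names in a namespace dict (e.g. globals()) bound to instances of `kind`."""
    return sorted(
        name
        for name, value in namespace.items()
        if isinstance(value, kind) and not name.startswith("_")
    )


# Files: demo in a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as workdir:
    for file_name in ("sales.csv", "costs.csv", "notes.txt"):
        (Path(workdir) / file_name).touch()
    csv_files = files_matching(workdir, "*.csv")

# In-memory objects: a made-up namespace standing in for globals().
session = {"prices": [1.5, 2.0], "labels": ["a", "b"], "total": 3.5}
list_names = names_of_type(session, list)

print(csv_files)   # ['costs.csv', 'sales.csv']
print(list_names)  # ['labels', 'prices']
```

In a real session, `files_matching(Path.cwd(), "*.csv")` and `names_of_type(globals(), list)` would apply the same filters to the actual working directory and workspace.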

lucyLattes, a script for handling data from the Lattes Platform

Introduction: Historically, CNPq has maintained a database of science and technology (S&T) researchers for several purposes, such as the evaluation of graduate programs and the selection of researchers for grants, among others. This database is called the Lattes Platform (Plataforma Lattes). Because the platform is so widely used, it has become the standard in universities, research agencies, etc. On this platform you can find everything from a professional's academic background and the companies they have worked for to their scientific and artistic output, etc.

Python beginner tips

Introduction: I spent around 4 years working with R, and I can say that R is amazing. However, I decided to start a new stage in my professional life, and a need arose: a multipurpose language. My choice was Python, because it provides data science tools and has huge support for other applications. After some months of studying in my free time, I would like to share some code that I learned (from several sources).

Reading and storing a series of files

Introduction: We often need to access a series of files to build a database. With two or three files, writing a few lines does not take long. However, with more than three files, whether .csv, .dat, etc., you spend time that could be invested elsewhere. In R, one option for this operation is to use lists. Case study I: Let's suppose you have a series of files in the format .
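The post itself works in R with lists. As a loose Python analogue of the same idea, and not the author's code, the sketch below reads every .csv file in a folder into one container keyed by file name. The helper `read_all_csv`, the file names, and the columns are all made up for illustration.

```python
import csv
import tempfile
from pathlib import Path


def read_all_csv(directory):
    """Read every .csv file in `directory` into a dict: file name -> list of row dicts."""
    tables = {}
    for path in sorted(Path(directory).glob("*.csv")):
        with path.open(newline="") as handle:
            tables[path.name] = list(csv.DictReader(handle))
    return tables


# Demo in a throwaway folder with two small, made-up files.
with tempfile.TemporaryDirectory() as folder:
    (Path(folder) / "plot_a.csv").write_text("id,value\n1,10\n2,20\n")
    (Path(folder) / "plot_b.csv").write_text("id,value\n3,30\n")
    tables = read_all_csv(folder)

# Stack all rows into a single list, one dict per row.
combined = [row for rows in tables.values() for row in rows]

print(sorted(tables))  # ['plot_a.csv', 'plot_b.csv']
print(len(combined))   # 3
```

The loop replaces one read statement per file, so the code stays the same whether the folder holds three files or three hundred.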