Poster session 1
1 - 1
(De)Constructing Bias on Skin Lesion Datasets
Alceu Emanuel Bissoto
Melanoma is the deadliest form of skin cancer. Automated skin lesion analysis plays an important role in early detection. Nowadays, the ISIC Archive and the Atlas of Dermoscopy are the most widely used skin lesion sources for benchmarking deep-learning-based tools. However, all datasets contain biases, often unintentional, due to how they were acquired and annotated. Those biases distort the performance of machine-learning models, creating spurious correlations that the models can unfairly exploit, or, conversely, destroying cogent correlations that the models could learn. We propose a set of experiments to investigate bias, positive and negative, in existing skin lesion datasets. We show that models can correctly classify skin lesion images without clinically meaningful information: disturbingly, a model trained on images from which all lesion information has been removed achieves accuracy above the AI benchmark curated from dermatologists' performances. That strongly suggests spurious correlations guiding the models. We fed models with additional clinically meaningful information, which failed to improve the results even slightly, suggesting the destruction of cogent correlations. Our main findings raise awareness of the limitations of models trained and evaluated on small datasets, and may suggest future guidelines for models intended for real-world deployment.
bias, skin lesion analysis, deep learning
1 - 2
A Latent Space of Protein Contact Maps
Emerson Correia Freitas Lima
"Finding a good homologous protein is crucial to predicting a target protein structure with good quality. The search for remote homologous is given by looking for target's neighbors in a given protein space. Deep Convolutional Generative Adversarial Networks (DCGANs) are deep learning models capable of learning meaningful embedded representation of data. Current methods are based on sequence alignments or contacts alignments. In this work, we build a latent space of protein folds through DCGANs with the aim of contributing to the problem of remote homology detection."
protein remote homology detection, computational biology, generative adversarial networks
1 - 3
A novel dynamic asset allocation system using Feature Saliency Hidden Markov models for smart beta investing
The financial crisis of 2008 generated interest in more transparent, rules-based strategies for portfolio construction, with smart beta strategies emerging as a trend among institutional investors. Whilst they perform well in the long run, these strategies often suffer from severe short-term drawdown (peak-to-trough decline) with fluctuating performance across cycles. To manage short-term risk (cyclicality and underperformance), we build a dynamic asset allocation system using Hidden Markov Models (HMMs). We use a variety of portfolio construction techniques to test our smart beta strategies, and the resulting portfolios show an improvement in risk-adjusted returns, especially on more return-oriented portfolios (up to 50% of return in excess of market, adjusted by relative risk, annually). In addition, we propose a novel smart beta allocation system based on the Feature Saliency HMM (FSHMM) algorithm that performs feature selection simultaneously with the training of the HMM, to improve regime identification. We evaluate our systematic trading system with real-life assets using MSCI indices; further, the results (up to 60% of return in excess of market, adjusted by relative risk, annually) show model performance improvement with respect to portfolios built using full-feature HMMs.
hidden markov model, portfolio optimization, feature selection
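As an illustrative aside (not the poster's FSHMM implementation), regime identification with an HMM boils down to decoding the most likely hidden state path given Gaussian return emissions. A minimal Viterbi decoder for a two-state "bull/bear" model, with all parameters chosen as toy assumptions:

```python
import numpy as np

def viterbi_gaussian(obs, pi, A, means, stds):
    """Most likely hidden state path for a Gaussian-emission HMM (log space)."""
    n, k = len(obs), len(pi)
    # Log emission probabilities for every observation under every state.
    logB = -0.5 * ((obs[:, None] - means) / stds) ** 2 - np.log(stds * np.sqrt(2 * np.pi))
    delta = np.log(pi) + logB[0]
    back = np.zeros((n, k), dtype=int)
    for t in range(1, n):
        scores = delta[:, None] + np.log(A)   # scores[i, j]: end in state j coming from i
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + logB[t]
    path = np.zeros(n, dtype=int)
    path[-1] = delta.argmax()
    for t in range(n - 2, -1, -1):            # backtrack the best path
        path[t] = back[t + 1, path[t + 1]]
    return path

# Toy regimes: low-volatility positive returns, then high-volatility negative returns.
rng = np.random.default_rng(0)
returns = np.concatenate([rng.normal(0.05, 0.1, 50), rng.normal(-0.2, 0.3, 50)])
states = viterbi_gaussian(returns,
                          pi=np.array([0.5, 0.5]),
                          A=np.array([[0.95, 0.05], [0.05, 0.95]]),   # sticky regimes
                          means=np.array([0.05, -0.2]),
                          stds=np.array([0.1, 0.3]))
```

The sticky transition matrix discourages spurious single-step regime flips, which is the property that makes HMM-based allocation systems robust to noisy returns.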
1 - 4
AI education in Latin America
A short survey of the different initiatives (foundations) working to democratize AI education in Latin America. Many volunteer-led tech communities across Latin America are seeking to help people learn and apply AI.
education, ed tech, ai education
1 - 5
Automatic emotion recognition from psychophysiological data: a preliminary bilateral electrodermal activity study.
Tomás Ariel D'Amelio
"Affective computing as a field of study has the objective of incorporating information about emotions and other affective states into technology. One of its areas of study is the automatic recognition of emotions. This can be achieved by different means, one of them being the application of psychophysiological methods. The aim of the present poster is to present the implementation of models of emotional recognition from bilateral electrodermal activity signals. In this way, the impact of introducing new bilateral features will be analyzed as a possible contribution to the existing affective state recognition models. "
affective computing, emotion recognition, bilateral electrodermal activity
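As an illustrative aside (not the poster's feature set), "bilateral" features can be as simple as comparing the tonic skin conductance level of the two hands. A minimal sketch with toy, synthetic signals; the feature names are assumptions for illustration:

```python
import numpy as np

def bilateral_eda_features(left, right):
    """Toy bilateral EDA features: per-side tonic level, left-right level
    asymmetry, and left-right correlation."""
    left, right = np.asarray(left, float), np.asarray(right, float)
    return {
        "scl_left": left.mean(),                         # tonic skin conductance level
        "scl_right": right.mean(),
        "scl_asymmetry": left.mean() - right.mean(),     # bilateral asymmetry
        "lr_correlation": np.corrcoef(left, right)[0, 1],
    }

# Synthetic 60 s of 4 Hz EDA with a slightly higher left-hand level.
rng = np.random.default_rng(1)
t = np.arange(0, 60, 1 / 4.0)
base = 2.0 + 0.01 * t + 0.05 * rng.standard_normal(t.size)   # drifting tonic signal
feats = bilateral_eda_features(base + 0.3, base)
```

Feature vectors like this one would then feed a standard classifier to test whether the bilateral terms improve recognition over single-side features.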
1 - 6
Multiscale Pain Intensity Estimation from Facial Expressions using CNN-RNN
Jefferson Quispe Pinares
Deep learning methods have achieved impressive results in several complex tasks, such as pain estimation from facial expressions in video sequences. Pain is difficult to estimate because it depends on subjective, person-specific features; nevertheless, its estimation is important for clinical evaluation processes. This work proposes the use of Convolutional Neural Networks (CNNs) with transfer learning and a sequence model using Gated Recurrent Units (GRUs) to obtain accurate pain estimates on different scales, after a preprocessing step based on facial landmarks. The Prkachin and Solomon Pain Intensity (PSPI) score, based on Action Units (AUs), has been investigated extensively, but other scales for estimating pain exist. For automatic per-video estimation of pain intensity, we use the Visual Analog Scale (VAS), among other scales, which allows us to report results in terms of the evaluation metric used by specialists.
pain, personalized, deep learning
1 - 7
Bayesian encoding and decoding properties of neurons in the dentate gyrus of the hippocampus
The intrinsic properties of neurons in a population are diverse, and distinct outputs may arise from this heterogeneity, reducing redundancy in the population. Due to the continuous birth of neurons in the dentate gyrus, neurons of different ages receive, process and convey information at any given point in time. While maturing, neurons develop their intrinsic properties in a stereotyped way, producing heterogeneity in the population. We hypothesize that young neurons play an active role in the processing of information already in their immature state. We study how neurons of different ages transform their input into a spiking response by performing patch clamp recordings and injecting fluctuating currents. By fitting Bayesian models called Generalized Linear Models to our data, we can predict responses for a given stimulus with a high degree of accuracy while obtaining a reduced characterization of the recorded neurons. We use these characterizations to compare the encoding properties of different age populations. How do different neurons represent stimuli? We explore this question by using the neurons' encoding models and recorded responses to estimate stimuli through model-based decoding. We can explore what features of the stimuli are preserved in the different neurons' responses and compute information measures.
neuroscience, bayes, coding
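As an illustrative aside (not the authors' pipeline), the GLM encoding model described above is commonly a linear-nonlinear-Poisson model: a linear filter over stimulus history, an exponential nonlinearity, and Poisson spiking. A minimal sketch that simulates such a neuron and recovers its filter by maximum likelihood (Newton/IRLS steps); all parameters are toy assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

def lagged(x, nlags):
    """Design matrix whose columns are lagged copies of the stimulus."""
    X = np.zeros((len(x), nlags))
    for k in range(nlags):
        X[k:, k] = x[: len(x) - k]
    return X

# Simulate a linear-nonlinear-Poisson (LNP) neuron driven by white noise.
stim = rng.standard_normal(5000)
X = lagged(stim, 5)
w_true, b_true = np.array([1.0, 0.5, 0.2, 0.0, -0.3]), -2.0
spikes = rng.poisson(np.exp(X @ w_true + b_true))

# Fit (filter, baseline) by Newton steps on the concave Poisson log-likelihood
# (iteratively reweighted least squares); last coefficient is the intercept.
Xd = np.column_stack([X, np.ones(len(X))])
beta = np.zeros(Xd.shape[1])
for _ in range(15):
    mu = np.exp(Xd @ beta)                     # predicted firing rate
    beta += np.linalg.solve(Xd.T @ (Xd * mu[:, None]), Xd.T @ (spikes - mu))
w_hat, b_hat = beta[:5], beta[5]
```

The fitted filter `w_hat` is the "reduced characterization" the abstract mentions: a handful of numbers that summarize how the neuron weights its recent input.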
1 - 8
Bayesian model of human visual search in natural images
The ability to efficiently find objects in the visual field is essential for a number of everyday life activities. Over the last decades, models that accurately predict the most likely fixation locations (saliency maps) have developed rapidly, and their performance is reaching a ceiling. Today, one of the biggest challenges in the field is to go beyond saliency maps to predict a sequence of fixations related to a given task. Visual search can be seen as an active process in which humans update the probability of finding a target at each point in space as they acquire more information. In this work, we build a Bayesian model for visual search in natural images. The model takes a saliency map as prior and computes the most likely fixation location given all the previous ones, taking several visual features into account. We recorded eye-tracking while participants looked for an object in a natural interior image. Our model was indistinguishable from humans in several measures of human behavior, in particular, the Scanpath Similarity. Thus, we were able to reproduce not only the general performance but also the entire sequence of eye movements.
visual search, neuroscience, bayesian modeling
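As an illustrative aside (a simplification of the model class described, not the poster's model), the Bayesian update can be sketched on a grid: the prior is the saliency map, each fixation that fails to find the target down-weights locations near that fixation, and the next fixation is the posterior maximum. The detection-probability falloff below is a toy assumption:

```python
import numpy as np

def next_fixation(prior, fixations, detect_radius=1.5):
    """Posterior over target location after 'target not found' observations at
    each past fixation; the next fixation is the posterior maximum (MAP rule)."""
    h, w = prior.shape
    yy, xx = np.mgrid[:h, :w]
    post = prior.astype(float).copy()
    for fy, fx in fixations:
        d2 = (yy - fy) ** 2 + (xx - fx) ** 2
        p_detect = np.exp(-d2 / (2 * detect_radius ** 2))  # visibility falls with eccentricity
        post *= 1.0 - p_detect                             # P(not found | target here)
    post /= post.sum()
    return post, np.unravel_index(post.argmax(), post.shape)

# Uniform saliency prior on a 10x10 grid; two fixations in one corner push
# the posterior mode toward the opposite, uninspected corner.
prior = np.ones((10, 10)) / 100.0
post, fix = next_fixation(prior, [(0, 0), (1, 1)])
```

Iterating this rule generates a full scanpath, which is what the poster compares against human eye-tracking data.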
1 - 9
Bayesian multilevel models for assessing the quality of NMR resolved protein structures
"On the present work we exploit the benefits of multilevel Bayesian models trained with protein structural observables, with the purpose of protein structure validation. The Bayesian models trained in this work allow us to estimate a reference for the observables that is unique for each protein structure thus revising the computation of these values. Through a data set of high quality protein structures we obtain reference curves for the expected differences between observed and corrected magnitudes, and benchmark NMR resolved structures against these curves. We also present a computational tool for graphic validation of NMR resolved protein structures based on the presented Bayesian framework. "
bayesian, multilevel models, protein structure
1 - 10
Combining Deep Learning and Prior Knowledge for Crop Mapping in Tropical Regions from Multitemporal SAR Image Sequences
Laura Elena Cué La Rosa
Accurate crop type identification and crop area estimation from remote sensing data in tropical regions are still considered challenging tasks. The more favorable weather conditions, in comparison to the characteristic conditions of temperate regions, permit higher flexibility in land use, planning, and management, which implies complex crop dynamics. Moreover, the frequent cloud cover prevents the use of optical data during large periods of the year, making SAR data an attractive alternative for crop mapping in tropical regions. This paper evaluates the effectiveness of Deep Learning (DL) techniques for crop recognition from multi-date SAR images from tropical regions. Three DL strategies are investigated: autoencoders, convolutional neural networks, and fully-convolutional networks. The paper further proposes a post-classification technique to enforce prior knowledge about crop dynamics in the target area. Experiments conducted on a Sentinel-1 multitemporal sequence of a tropical region in Brazil reveal the pros and cons of the tested methods. In our experiments, the proposed crop dynamics model was able to correct up to 16.5% of classification errors and managed to improve the performance up to 3.2% and 8.7% in terms of overall accuracy and average F1-score, respectively.
crop mapping, multitemporal image analysis, deep learning
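As an illustrative aside (not the paper's post-classification method), enforcing prior knowledge about crop dynamics can be sketched as a dynamic-programming search for the highest-probability label sequence that respects a mask of permitted crop transitions. Classes and transition rules below are toy assumptions:

```python
import numpy as np

def most_probable_valid_sequence(probs, allowed):
    """probs: (T, C) per-date class posteriors; allowed: (C, C) boolean mask of
    permitted crop transitions. Returns the highest-probability label sequence
    that respects the transition rules (Viterbi-style dynamic programming)."""
    T, C = probs.shape
    logp = np.log(probs + 1e-12)
    trans = np.where(allowed, 0.0, -np.inf)        # forbidden transitions: -inf
    score = logp[0].copy()
    back = np.zeros((T, C), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + trans              # cand[i, j]: come from i, go to j
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + logp[t]
    seq = np.zeros(T, dtype=int)
    seq[-1] = score.argmax()
    for t in range(T - 1, 0, -1):
        seq[t - 1] = back[t, seq[t]]
    return seq

# Toy classes: 0 = soybean, 1 = maize, 2 = bare soil. Suppose soybean cannot be
# followed directly by maize: the raw per-date argmax [0, 1] is invalid, and the
# dynamic program corrects it to the best valid alternative.
allowed = np.array([[True, False, True],
                    [True, True,  True],
                    [True, True,  True]])
probs = np.array([[0.6, 0.3, 0.1],
                  [0.1, 0.5, 0.4]])
seq = most_probable_valid_sequence(probs, allowed)
```

This kind of correction operates purely on the classifier's per-date outputs, which is why it can be applied as a post-classification step to any of the three DL strategies.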
1 - 11
Convolutional-LSTM for multi-image to single-diagnostic medical diagnostics
Deep learning has been successfully applied to medical computer vision problems, commonly generating a diagnostic per image (in a supervised learning setting) and then combining the diagnostics using statistical techniques into a general diagnostic per patient. This requires a training set with a diagnostic per image, but in many medical situations, such as head scans, it is uncommon to have one diagnostic per image; instead, doctors emit a single diagnostic for the patient based on an unknown and variable number of images. We designed a convolutional-LSTM architecture and a variation of the stochastic gradient descent training pipeline to create a model for head-scan medical diagnostics that is able to take a sequence of CT scans (of unknown and variable size) and emit a single diagnostic per patient, trained in a multi-image, single-diagnostic setting.
1 - 12
Data analytics opportunities for an education service provider: The case of Plan Ceibal
"Plan Ceibal was created in 2007 as a plan for inclusion and equal opportunities with the aim of supporting Uruguayan educational policies with technology. Since then, it became the first nationwide ubiquitous educational computer program in the world based on the 1:1 model, providing every student and teacher in the K-12 public education system with a laptop or tablet and internet access at each school. In addition, Plan Ceibal also provides digital educational content and resources to enhance the teaching and learning process, most notably a Learning Management System, a Mathematics Adaptive Platform, a remote English teaching program and an online library. Today, with close to 700,000 beneficiaries, Plan Ceibal manages a large amount of data collected from many different sources such as the end user devices, the network infrastructure and the educational platforms. This fact presents a great challenge, but also the huge opportunity to convert the massive data into rich information, which can be used to support and improve the current technology and learning educational policies."
education, laptops, wi-fi
1 - 13
Decoding neurophysiological information from Convolutional Neural Network layers in single-trial EEG classification
Predicting movement from brain signals is crucial for many biomedical applications. Careful engineering and domain expertise are required to perform electroencephalogram (EEG) feature extraction into a suitable representation for the classification stage. Representation learning methods can automatically perform feature extraction and classification through optimization algorithms. In particular, Convolutional Neural Networks (ConvNet) recently showed promising results in prediction of movement speed and force from single-trial EEG. The downside of this approach is that it does not provide direct insight into neurophysiological phenomena underlying a decision. In this regard, there has been some progress recently in the field of network visualization, but more research is still required in order to come up with strategies for extracting neurophysiological information. In this work, we analyzed the resulting layers of ConvNets that were trained to predict movement from single-trial EEG recorded from several channels over the motor cortex. Our results show that the discriminative information is predominantly decoded in the spatial filter layer, and that the network structure can be substantially reduced by taking this knowledge into consideration.
neurophysiology, convolutional neural network, single-trial eeg
1 - 14
Deep Learning based parameterization for lightning prediction
Mailén Gómez Mayol
Lightning poses a serious hazard to the public and is responsible for several fatalities and damages, therefore, it is important to improve the community's ability to forecast lightning. Predicting the spatio-temporal location of lightning is a complex problem. The usual methods for forecasting meteorological phenomena are based on the numerical integration of the equations that describe them in a spatial grid. Lightning occurs in time and space scales smaller than the usual grids, and therefore a "parameterization" is used. The nature of lightning leads us to try a new approach: deep neural networks. In this work, we train a deep neural network to generate an empirical parameterization of lightning from available data. In particular, we present a model based on the training of a GAN with numerical forecast model data and lightning observations, without knowing the equations that govern the phenomenon. The network thus trained is capable of returning the lightning rate from physical variables that describe the state of the atmosphere. Comparisons of the network output with a lightning parameterization commonly used in the numerical weather forecast show that the deep learning model can match or improve the results obtained by the usual methods.
lightning prediction, gan, atmosphere model
1 - 15
Deep Learning for Image Sequence Classification of Astronomical Events
Rodrigo Carrasco Davis
We propose a new sequential classification model for astronomical objects based on a recurrent convolutional neural network (RCNN) which uses sequences of images as inputs. This approach avoids the computation of light curves or difference images. This is the first time that sequences of images are used directly for the classification of variable objects in astronomy. We generate synthetic image sequences which take into account the instrumental and observing conditions, obtaining a realistic set of movies for each astronomical object. The simulated data set is used to train our RCNN classifier. Using a simulated data set is faster and more adaptable to different surveys and classification tasks. Furthermore, the simulated data set distribution is close enough to the real distribution, so the RCNN with fine tuning has a similar performance on the real data set compared to the light curve classifier. The RCNN approach presents several advantages, such as a reduction of the data pre-processing, faster online evaluation, and easier performance improvement using a few real data samples. The results obtained encourage us to use the proposed method for astronomical alert broker systems that will process alert streams generated by new telescopes such as the Large Synoptic Survey Telescope.
astronomy, image simulation, fine tuning
1 - 16
DockThor-VS: A Free Docking Server For Protein-Ligand Virtual Screening using the Supercomputer SDumont
Isabella Alvim Guedes
Receptor-ligand molecular docking is a structure-based approach widely used by the scientific community in Medicinal Chemistry to assist the process of drug discovery, searching for new lead compounds against relevant therapeutic targets with known three-dimensional structures. The DockThor program, developed by our group (GMMSB/LNCC), has obtained promising results in comparative studies with other well-established docking programs for pose prediction across distinct ligand chemical classes and molecular targets. Recently, we developed machine learning-based scoring functions with protein-ligand interaction-driven features for predicting binding affinities of protein-ligand complexes. The competitive performance of the DockThor program for binding mode prediction, and the accuracy of the recently developed affinity functions, encouraged us to develop the DockThor-VS portal as a free and reliable tool for virtual screening. The DockThor-VS portal utilizes the computational facilities provided by the SINAPAD Brazilian high-performance platform and the petaflop supercomputer SDumont (freely available to the scientific community at http://www.dockthor.lncc.br).
drug design, molecular modeling, machine learning-based affinity prediction
1 - 17
Estimating deforestation events in the Semiarid Chaco Forest in Argentina using GIS, remote sensing and machine learning models.
Veronica Barraza Berradas
Semi-arid forest ecosystems play an important role in seasonal carbon cycle dynamics; however, these ecosystems are prone to heavy degradation. In subtropical Argentina, the Chaco region has the highest absolute deforestation rates in the country (200,000 ha/year), and at the same time, it is the least represented ecoregion in the national protected areas system. There is a critical need for methods that enable the analysis of satellite image time series to detect forest disturbances, especially in developing countries (e.g. Argentina). The Forest Management Unit (UMSEF) in Argentina provides annual deforestation maps based on visual inspection of Landsat images (Landsat 7 ETM+ and Landsat 8 OLI), which requires long processing times and the intensive, coordinated participation of many people. In this research, we assess the potential of the Random Forest (RF) algorithm, applied to the Landsat dataset and geographic information system (GIS) information, to detect cover change over the Dry Chaco Forest (DCF) in Argentina. To identify the factors that drive agricultural expansion, we computed feature importances. Results indicate that distance to previously deforested areas, distance to rivers and remote sensing vegetation indices are sufficient to predict deforestation events.
deforestation, random forest, remote sensing
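As an illustrative aside (a library-agnostic stand-in for the Random Forest importances the abstract mentions), permutation importance measures how much accuracy drops when one feature column is shuffled. A minimal sketch on toy, synthetic data, with a hypothetical rule-based classifier standing in for the fitted RF:

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=20, rng=None):
    """Model-agnostic permutation importance: mean accuracy drop when a single
    feature column is shuffled, averaged over `n_repeats` shuffles."""
    rng = rng or np.random.default_rng(0)
    base = (predict(X) == y).mean()
    drops = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])   # break feature j's link to y
            drops[j] += base - (predict(Xp) == y).mean()
    return drops / n_repeats

# Toy setup: deforestation events occur close to previous deforestation
# (feature 0); feature 1 is pure noise. The stand-in classifier thresholds
# feature 0 the same way the labels were generated.
rng = np.random.default_rng(3)
X = rng.uniform(0, 1, (500, 2))
y = (X[:, 0] < 0.3).astype(int)
imp = permutation_importance(lambda Z: (Z[:, 0] < 0.3).astype(int), X, y, rng=rng)
```

On this toy data the distance-like feature gets a large importance and the noise feature gets none, mirroring the abstract's finding that a few spatial features dominate the prediction.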
1 - 18
Evaluation of methodologies for classifying mutations in genomics
"Human genome sequencing has become a frequent tool in the clinical practice, facilitating the determination of a large number of genetic variants. The interpretation of these variants remains a great challenge and even though the development of rules and tools for variant interpretation has increased, many variants remain unclassified or with conflicting interpretation of pathogenicity . The ClinVar public database has become an indispensable resource for clinical variant interpretation, where clinical laboratories, researchers and others share their classifications of variants with evidence, documenting the clinical impact of more than 400,000 variants. ClinVar uses standard terms for pathogenicity level (recommended by ACMG/AMP), and differences in interpretation among submitters within those levels are reported as 'Conflicting interpretations of pathogenicity’. In this sense, the goal of this work is to carry out an evaluation of several Machine Learning techniques in order to reclassify variants with conflicting interpretations. To that end, we use the variants classified as ‘Benign’, ‘Likely Benign’, ‘Pathogenic’ and ‘Likely pathogenic’ from the ClinVar database, previously annotated with ANNOVAR as training set. We believe this approach could be helpful to disambiguate the interpretation of genomic variants, and improving the analysis of variants in pursuit of new insights into pathogenicity. "
genomics, variant classification, conflicting variants
1 - 19
Evaluation of the use of artificial neural networks in the classification of diabetic retinopathy with retinal fundus images obtained using smartphones
Marilia Rosa Silveira
Diabetic retinopathy (DR) is a microvascular complication resulting from occlusion of retinal vessels caused by diabetes. Fundoscopy is used to identify clinical findings and classify DR into degrees according to their evolution. Machine Learning (ML) techniques, such as Convolutional Neural Networks (CNNs), have been used to recognize patterns in images. This project aims to apply a machine learning approach to classify fundoscopy images in order to establish priority care for an ophthalmology specialist, according to clinical imaging characteristics that converge to a higher or lower risk of blindness related to diabetic retinopathy. The project is the final work of a Biomedical Informatics undergraduate course. We are conducting a systematic review to identify, appraise and synthesize studies on the analysis of fundoscopy images using artificial neural networks to identify and classify DR. At the same time, we are preparing the database of annotated images, performing image preprocessing, and training and validating the neural network. The database is composed of classified images obtained by the project "Analysis of Retinal Images Obtained with a Mobile Phone", approved by the Ethics Committee of the Santa Casa de Misericórdia Hospital in Porto Alegre, as well as public data to complement this base.
fundus image, diabetic retinopathy, deep learning
1 - 20
External validation and characterization of cancer subtypes using SBC
The Survival Based Bayesian Clustering (SBC) model by Ahmad and Frohlich (2017) infers clinically relevant cancer subtypes by jointly clustering molecular and survival data. Originally, the model was tested on real breast cancer and glioblastoma (GBM) data sets, without external validation. The main objective of this project was to perform an external validation of the SBC based on the Verhaak samples, along with a rigorous feature engineering process, and to characterise the clusters and signature with other clinical and omics data. A patient cohort of 421 samples (160 training, 261 validation) from the TCGA-GBM data set was retrieved with RTCGAToolbox and pre-processed. The feature engineering approaches with the most distinct Kaplan-Meier curves were Block HSIC-Lasso (p-value training = 1.08e-05, p-value validation = 0.05) and a PAFT model on a collection of oncogenic gene sets (p-value training = e+00, p-value validation = 1.8e-02). In both cases there was an improvement in the initial Predictive C-Index (Block HSIC-Lasso = +1.5%, PAFT = +27.6%) and Recovery C-Index (Block HSIC-Lasso = +8.7%, PAFT = +5.0%). The SBC has proven to perform successfully on an external TCGA-GBM patient cohort.
survival, glioblastoma, clustering
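As an illustrative aside (not the poster's evaluation code), the C-index reported above is typically Harrell's concordance index: the fraction of comparable patient pairs in which the model assigns the higher risk to the patient who fails earlier. A minimal, simplified sketch for right-censored data (it omits some tie-handling conventions of the full definition):

```python
import numpy as np

def c_index(time, event, risk):
    """Simplified Harrell's C: among pairs anchored at an observed event,
    the fraction where the earlier-failing subject has the higher risk score.
    event = 1 for an observed death, 0 for a censored observation."""
    time, event, risk = map(np.asarray, (time, event, risk))
    num = den = 0
    for i in range(len(time)):
        if not event[i]:
            continue                       # censored subjects cannot anchor a pair
        comparable = time > time[i]        # subjects known to survive past time[i]
        den += comparable.sum()
        num += (risk[comparable] < risk[i]).sum()
        num += 0.5 * (risk[comparable] == risk[i]).sum()   # ties count half
    return num / den

# Perfectly anti-ranked risks (highest risk dies first) give C = 1.0.
t = np.array([1.0, 2.0, 3.0, 4.0])
e = np.array([1, 1, 1, 1])
r = np.array([4.0, 3.0, 2.0, 1.0])
c = c_index(t, e, r)
```

A value of 0.5 corresponds to random ranking, which is why the reported improvements of a few percentage points are meaningful on this scale.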
1 - 21
Flavor tagging algorithms for Jets in ATLAS, at CERN
Maria Roberta Devesa
The identification of jets containing a b-quark (b-tagging) is an important component of the physics program of the ATLAS experiment at CERN. Several searches for New Physics significantly increase their sensitivity when identifying these jets. We present b-tagging algorithms, some of them using machine learning techniques, and compare their performance.
b-tagging, atlas, b-jets
1 - 22
Fraud Detection in Electric Power Distribution: An Approach that Maximizes the Economic Return.
The detection of Non-Technical Losses (NTL) is a very important economic issue for power utilities. Diverse machine learning strategies have been proposed to support electric power companies in tackling this problem. Methods' performance is often measured using standard cost-insensitive metrics such as accuracy, true positive ratio, AUC, or F1. In contrast, we propose to design an NTL detection solution that maximizes the effective economic return. To that end, both the income recovered and the inspection cost are considered. Furthermore, the proposed framework can be used to design the infrastructure of the division in charge of performing customer inspections, thus assisting not only short-term decisions (e.g., which customer should be inspected first) but also the elaboration of long-term strategies (e.g., planning the company's NTL budget). The problem is formulated in a Bayesian risk framework. Experimental validation is presented using a large dataset of real users from the Uruguayan utility (UTE). The results obtained show that the proposed method can boost companies' profit and provide a highly efficient and realistic countermeasure to NTL. Moreover, the proposed pipeline is general and can be easily adapted to other practical problems.
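As an illustrative aside (a simplification of the Bayesian risk idea, not the paper's formulation), the decision rule reduces to inspecting a customer only when the expected recovered income exceeds the inspection cost. A minimal sketch with toy numbers; the probabilities and amounts below are assumptions for illustration:

```python
import numpy as np

def inspection_plan(p_fraud, recoverable, cost):
    """Expected net return of inspecting each customer, plus the indices of
    profitable inspections ranked from most to least profitable.
    p_fraud: model posterior P(NTL); recoverable: income recovered if fraud
    is confirmed; cost: inspection cost per customer."""
    expected_return = p_fraud * recoverable - cost
    order = np.argsort(-expected_return)              # most profitable first
    return expected_return, order[expected_return[order] > 0]

# Three toy customers: likely fraud with high stakes, unlikely fraud,
# and a coin-flip case whose recoverable income is too small to justify
# the inspection cost.
p = np.array([0.9, 0.1, 0.5])
recov = np.array([100.0, 100.0, 20.0])
ret, inspect = inspection_plan(p, recov, cost=15.0)
```

Under a cost-insensitive metric the third customer (P = 0.5) might look worth flagging, but the economic criterion correctly excludes it, which is the core of the paper's argument.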
1 - 23
High-throughput phenotyping of plant roots in temporal series of images using deep learning
Rafael Nicolas Gaggion Zulpo
Root segmentation in plant images is a crucial step when performing high-throughput plant phenotyping. This task is usually performed in a manual or semi-automatic way, delineating the root in pictures of plants growing vertically on the surface of a semisolid agarized medium. Temporal phenotyping is generally avoided due to technical limitations in capturing such pictures over time. In this project, we employ a low-cost device composed of plastic parts produced with a 3D printer, low-price cameras and infrared LED lights to obtain a photo sequence of growing plants. We propose a segmentation algorithm based on convolutional neural networks (CNNs) to extract the plant roots, and present a comparative study of three different CNN models for this task. Our goal is to generate a reliable graph representation of the root system architecture, useful for obtaining descriptive phenotyping parameters.
plant root segmentation, high-throughput phenotyping, cnns
1 - 24
Identifying pathogenic variants in non-coding regions of the human genome
Ben Omega Petrazzini
"The World Health Organization (WHO) reports around 300M cases of rare diseases worldwide1, half of them are affecting children2. There are over 7000 different types3, most of which are genetically caused by low frequency Single Nucleotide Polymorphisms (SNPs). This makes them really hard to diagnose for traditional medicine4, which hinders any possible treatment. Fortunately, this same characteristic makes them suitable for bioinformatics approaches. Classic in-silico techniques work well for coding regions of the genome. However, non-coding (NC) regions are 49 times bigger and variants are difficult to classify due to the lack of biological evidence. This results in a massive dataset with little information, which makes it very difficult to asses. To address this problem we are developing a Machine Learning algorithm in order to prioritize pathogenic variants in NC regions of the genome. To that end, we use the ClinVar public database annotated with ANNOVAR as training set. Once the best model is fitted we will use it to reduce the number of NC variants of interest in patients with an undiagnosed rare disease. We believe this approach will accelerate the diagnosis process in rare disease patients, giving a mayor relief for the individual and its family."
genomic diagnosis, rare diseases, snp prioritization
1 - 25
In silico prediction of drug-drug interactions
Nigreisy Montalvo Zulueta
A drug-drug interaction (DDI) is a change in the therapeutic effect of a drug when another drug is co-administered. Early detection of these interactions is mandatory to avoid inadvertent side effects that can lead to failure of clinical treatments and increase healthcare costs. Computational prediction of DDIs has been approached as a classification problem where two classes are defined: interacting drug pairs (positive class) and non-interacting drug pairs (negative class). Positive DDIs are usually obtained from public databases that contain lists of validated DDIs; however, negative DDIs are drug pairs generated randomly, due to the lack of a “gold standard” non-DDI dataset. In the present work, we propose to perform a disproportionality analysis of the FDA Adverse Event Reporting System (FAERS) with the aim of finding drug pairs that are often co-administered and did not generate a signal of interaction. We selected these pairs as negative-class examples. We calculated drug-drug pair similarity using nine biological features and finally applied five machine learning-based predictive models: decision tree, Naïve Bayes, Support Vector Machine (SVM), logistic regression and K Nearest Neighbors (KNN). SVM obtained the highest AUC value (0.77) based on ten-fold cross-validation.
bioinformatics, drug-drug interactions, supervised machine learning
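As an illustrative aside (not the poster's nine features), a common way to turn a drug pair into a similarity feature is the Jaccard index between the two drugs' binary biological profiles (e.g., shared targets or enzymes). A minimal sketch with hypothetical profiles:

```python
import numpy as np

def jaccard(a, b):
    """Jaccard similarity between two binary feature profiles."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

# Hypothetical binary profiles over 5 biological attributes
# (e.g., targets / enzymes / transporters a drug is known to hit).
drugs = {
    "A": np.array([1, 1, 0, 0, 1]),
    "B": np.array([1, 1, 0, 0, 0]),
    "C": np.array([0, 0, 1, 1, 0]),
}
sim_ab = jaccard(drugs["A"], drugs["B"])   # biologically similar pair
sim_ac = jaccard(drugs["A"], drugs["C"])   # disjoint pair
```

Computing one such score per feature type yields the pair-similarity vector that classifiers like the SVM in the abstract take as input.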
1 - 26
Machine Learning Applied to Social Sciences: New Insights to Understand Human Behavior
"The research in Social Sciences is fundamental to the study of human behavior. Beliefs and motivations play an important role in people's decision-making and choices. This relationship is relevant to explain the behavior in a population, and therefore, it allows for outlining social actions to improve the community. Knowing this, we proposed a way to discover meaningful patterns from a database of social studies using state-of-the-art techniques of Artificial Intelligence and Social Sciences. In this context, we selected Social Activism to perform classification using the extensive Word Values Survey (WVS) database. The database used contain a survey applied in several countries, divided into periods called Waves. The Waves handled in this study were Wave 5 (2005-2009), Wave 6 (2010-2014), and Wave 7 (2018-2022). Thus, we discovered the patterns in the databases in the longitudinal view that make sense from the perspective of the Social Sciences. These patterns indicate the tendency of peoples of the world is concerned with issues of moral-ethical than other aspects, such as politics, for example."
application, explainable, computational social science
1 - 27
Machine learning based label-free fluorescence lifetime skin cancer screening.
Renan Arnon Romano
Skin cancer is the most prominent cancer type worldwide. Early detection is critical and can increase survival rates. Well-trained dermatologists are able to diagnose it accurately through clinical inspection and biopsies; however, clinically similar lesions are often misclassified. This work aims to screen similar benign and malignant lesions of both pigmented and non-pigmented types. Fluorescence lifetime imaging measurements were carried out on patients whose lesions had been diagnosed by a dermatologist (and confirmed by biopsy). The technique does not require the addition of any markers and can be performed noninvasively. Metabolic fluorescence lifetime images were acquired using a Nd:YAG laser emitting at 355 nm to excite the skin fluorophores. Collagen/elastin, NADH, and FAD emission spectral bands were analyzed for nodular basal cell carcinomas and intradermal nevi, as well as for melanomas and pigmented seborrheic keratoses. Features were extracted from the lifetime decays and used as input to a simple partial least squares discriminant analysis model. After splitting the data into train and test sets, it was possible to achieve a ROC area of around 0.8 on the test set for both the melanoma and the basal cell carcinoma discriminations.
label-free imaging, fluorescence lifetime imaging, computer aided diagnosis
1 - 28
Machine Learning for Humanoid Robot Soccer
Marcos Ricardo Omena de Albuquerque Maximo
ITANDROIDS is a robot soccer team from the Aeronautics Institute of Technology, Brazil. This poster presents the team's recent efforts in applying machine learning to robot soccer. We present a convolutional neural network (CNN) based on the You Only Look Once (YOLO) system for detecting the ball and the goal posts on the soccer field of a humanoid robot competition. We also show efforts in applying deep reinforcement learning to learn motions and behaviors for a simulated humanoid robot. We used Proximal Policy Optimization (PPO), which is well suited to continuous-domain tasks, to learn a dribbling behavior that surpassed our own hand-coded behavior. Moreover, we used the same algorithm to learn a high-performance kick motion. In the latter case, behavior cloning was used to bootstrap the training, which helped the algorithm converge to a better local optimum.
machine learning, robotics, robot soccer
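PPO's defining element, the clipped surrogate objective, is simple enough to state inline. The sketch below shows the per-sample term only; it is an illustration of the standard PPO-Clip formula, not ITANDROIDS' training code:

```python
# Per-sample PPO-Clip surrogate: min(r*A, clip(r, 1-eps, 1+eps)*A),
# where r is the new/old policy probability ratio and A the advantage.
def ppo_clip_objective(ratio, advantage, eps=0.2):
    """Clipping discourages policy updates that move r outside [1-eps, 1+eps]."""
    clipped = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped * advantage)
```

In training, this term is averaged over a batch and maximized by gradient ascent on the policy parameters.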
1 - 29
Neural Networks applied to small datasets: efficiency evolution of natural gas networks
Chiarvetto Peralta Lucila Lourdes
In January 2002, price updates for Argentinian public services were halted. Based on a dataset provided by the natural gas regulatory authority, an Artificial Neural Network was employed to study the change in efficiency of the natural gas transport system. Gas leakage, which served as a proxy for operating inefficiency, was estimated by a Multilayer Perceptron. The model was trained using technology-related data from 2002 onwards, and the earlier information was used for leakage prediction, allowing comparison against the real values.
natural gas transport system, efficiency, Artificial Neural Network
1 - 30
Optimizing classifier parameters for insect datasets
Bruno Gomes Coelho
Motivated by the real-life problem of identifying, and then selectively capturing, dangerous insects that transmit various diseases, this poster analyzes the random search method for optimizing the parameters of two of the most recommended machine learning algorithms: Support Vector Machines (SVM) and Random Forests (RF).
parameter optimization, insect classification, random search
1 - 31
Photometric redshifts for S-PLUS using machine learning techniques
Erik Vinicius Rodrigues de Lima
The distance to celestial objects is a fundamental quantity for studies in astronomy and cosmology. Until recently, the only way to obtain this information was via spectroscopy, but ever-larger surveys, with enormous amounts of data, have made this approach infeasible. Therefore, a new way to estimate the distance to objects, based on photometry, was developed. Photometric redshifts can be acquired for many objects in a time-inexpensive manner. In this work, the objective is to investigate how machine learning methods perform when using the 12-filter system of S-PLUS. A comparison with the method currently used for this purpose (BPZ) is also presented. The results show that S-PLUS has the potential to deliver accurate photometric redshifts using machine learning techniques.
photometric redshifts, galaxies, machine learning
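Photo-z methods are conventionally compared with a pair of standard statistics, which could serve to benchmark the machine-learning estimates against BPZ. A minimal sketch of those statistics (the abstract does not state which metrics were used; `photoz_metrics` is a hypothetical helper):

```python
# Standard photo-z quality metrics: normalized residuals
# dz = (z_phot - z_spec) / (1 + z_spec), their robust scatter
# sigma_NMAD = 1.4826 * median(|dz - median(dz)|), and the
# fraction of catastrophic outliers with |dz| above a cut.
import statistics

def photoz_metrics(z_phot, z_spec, outlier_cut=0.15):
    dz = [(zp - zs) / (1.0 + zs) for zp, zs in zip(z_phot, z_spec)]
    med = statistics.median(dz)
    sigma_nmad = 1.4826 * statistics.median(abs(d - med) for d in dz)
    outlier_frac = sum(abs(d) > outlier_cut for d in dz) / len(dz)
    return sigma_nmad, outlier_frac
```

Lower values of both quantities indicate better photometric redshifts.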
1 - 32
Predicting Diabetes Disease Evolution Using Financial Records and Recurrent Neural Networks
Rafael Texeira Sousa
Managing patients with chronic diseases is a major and growing healthcare challenge in several countries. A chronic condition, such as diabetes, is an illness that lasts a long time and does not go away, and often leads to the patient's health gradually getting worse. While recent works use raw electronic health records (EHR) from hospitals, this work uses only financial records from health plan providers to predict diabetes disease evolution with a self-attentive recurrent neural network. Financial data is used because it can serve as an interface to international standards, as its record format encodes medical procedures. The main goal was to assess high-risk diabetics, so we predict records related to acute diabetes complications such as amputations and debridements, revascularization, and hemodialysis. Our work succeeds in anticipating complications between 60 and 240 days in advance, with an area under the ROC curve ranging from 0.81 to 0.94. This assessment will give healthcare providers the chance to intervene earlier and head off hospitalizations. We aim to deliver personalized predictions and recommendations to individual patients, with the goal of improving outcomes and reducing costs.
diabetes, rnn, self-attention
1 - 33
Prediction of Frost Events Using Machine Learning and IoT Sensing Devices
Ana Laura Diedrichs
"In this poster, I would like to introduce a frost prediction already published in IEEE IoT Journal, https://doi.org/10.1109/JIOT.2018.2867333, preprint available in https://anadiedrichs.github.io/files/publications/2018-IoT-Diedrichs.pdf If there are time and space, I would like to share my experience using LSTM or GRU for a frost prediction system, a WIP project, so I can have the chance to get feedback about time series deep learning implementations."
frost, machine learning, precision agriculture
1 - 34
Reinforcement learning for Bioprocess Optimisation
Ilya Orson Sandoval Cárdenas
"Bioprocesses have recently received attention to produce clean and sustainable alternatives to fossil-based materials. However, they are generally difficult to optimize due to their unsteady-state operation modes and stochastic behaviours. Furthermore, plant-model mismatch is often present. In this work we leverage a model-free Reinforcement Learning optimisation strategy. We apply a Policy Gradient method to tune a control policy parametrized by a recurrent neural network. We assume that a preliminary model of the process is available, which is exploited to obtain an initial optimal control policy. Subsequently, this policy is partially updated based on a variation of the starting model to simulate the plan-model mismatch."
reinforcement learning, optimization, bioprocesses
1 - 35
Skin tone and undertone determination using a Convolutional Neural Network model
In the makeup industry, skin products are recommended to a guest based on their skin color and personal preferences. While the latter plays a key role in the final choice, accurate skin color and foundation matching is a critical starting point of the process. Skin color and foundation shades are categorized in the industry by their tone and undertone. Skin tone is typically divided into 6 categories linked to epidermal melanin, the Fitzpatrick scale, ranging from fair to deep, while undertone is usually divided into 3 categories: cool, neutral, and warm. Other scales exist, such as the Pantone Skin Tone Guide, reaching 110 combinations of tone and undertone. Both tone and undertone can be well represented by a two-dimensional continuum or discretized into as many ordered categories as desired. Non-uniform illumination, auto exposure, white balance, and skin conditions (spots, redness, etc.) all pose important challenges in determining skin color from direct measurements of semi-controlled face images. Previous work has shown good results for skin tone classification into 3 or 4 categories, while undertone classification has not yet been addressed in the literature. We propose a solution for inferring skin tone and undertone from face images by training a CNN that outputs a two-dimensional regression score representing skin tone and undertone. The CNN was trained on face images labeled with the discrete 6 tone and 3 undertone categories, mapped into a score for regression. This approach achieves an accuracy of 78% for skin tone and 82% for undertone. In addition, the score allows for a simplified matching scheme between skin tone/undertone and the foundation colors.
skin tone, regression, convolutional neural network
1 - 36
Stream-GML: a Generic and Adaptive Machine Learning Approach for the Internet
The application of AI/ML to Internet measurement problems has grown substantially in the last decade; however, the complexity of the Internet as a learning environment has so far hindered the wide adoption of AI/ML in practice. We introduce Stream-GML, a generic stream-based ensemble learning model for the analysis of network measurements. Stream-GML deals with two major challenges in networking: the constant occurrence of concept drift, and the lack of generalization of the obtained learnings. To deal with concept drift, Stream-GML relies on adaptive memory-sizing strategies, periodically retraining the underlying models according to changes in the empirical distribution of incoming samples, or based on performance degradation over time. To deal with generalization of learning results and (partially) counterbalance catastrophic forgetting, Stream-GML uses as its underlying model a novel stacking ensemble learning meta-model known as the Super Learner (SL). The SL model performs asymptotically as well as the best input base learner, and provides a powerful approach for tackling multiple problems in parallel while minimizing the likelihood of over-fitting. The SL meta-model is extended to the dynamic, stream setting, controlling the exploration/exploitation trade-off through reinforcement learning and no-regret learning principles.
stream learning, ensemble learning, network attacks
1 - 37
Supervised Learning Study of Changes in the Neural Representation of Time
Estevão Uyrá Pardillos Vieira
"When counting time is essential for optimal behavior, animals must make use of some inner temporal representation to guide responses. This temporal representation can be instantiated in the neural activity, measured via intracranial recordings, and assessed in a parsimonious manner by machine learning techniques. We aimed to shed some light on the process of learning to time by studying how temporal representations develop in two relevant areas: the medial Pre Frontal Cortex (mPFC) and the Striatum. For this purpose, we used regression techniques to predict the time elapsed since a sustained response has started based on the neural activity. We then measured the performance of the regression algorithm, associating higher performance with better time representation. As expected, we found patterns of activity consistent with time representation in both areas. However, the effect of training was inverted between the areas, with mPFC's representation weakening while the Striatum's enhances, thus indicates a migration of dependencies from mPFC to the Striatum. Our findings, consistent with habit formation, suggest new directions of research for the timing community and illustrate the potential of machine learning in the study of neuroscience. Experiments were approved at CEUA-UFABC with protocol numbers 2905070317 and 1352060217."
representation, timing, neuroscience
1 - 38
Testing a simple Random Forest approach to predict surface evapotranspiration from remote sensing data
Vanesa M. Douna
Evapotranspiration (ET), the sum of the water evaporated and transpired from the land surface to the atmosphere, is crucial to ecosystems, as it affects the soil, the vegetation, and the atmosphere, and mediates their interactions. Modelling and quantifying it accurately is critical for sustainable agriculture, forest conservation, and natural resource management. Although ET cannot be remotely sensed directly, remote sensing provides continuous data on surface and biophysical variables, and has thus been an invaluable tool for estimating ET. In this work, we evaluated the potential of a Random Forest regressor to predict daily evapotranspiration at three sites in Northern Australia from daily in-situ meteorological data and satellite data on leaf area index and land surface temperature. We obtained satisfactory performance, with RMSE values around 1 mm/day (rRMSE around 0.3), comparable to those obtained in previous works by different methods. Sensitivity to variations in the training sample and the importance of the input variables were analyzed. Our promising results and the simplicity of the method reinforce the relevance of exploring this approach in depth in other ecosystems at different temporal and spatial scales, aiming to develop a versatile and operational ET product.
evapotranspiration, remote sensing, random forest
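The two error figures quoted above can be made concrete. A minimal sketch, assuming rRMSE is the RMSE normalized by the mean observed value (the abstract does not define it; other normalizations exist):

```python
# RMSE and relative RMSE for daily ET predictions, as in the
# reported ~1 mm/day RMSE and ~0.3 rRMSE.
import math

def rmse(pred, obs):
    """Root mean squared error, in the units of the observations (mm/day)."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def rrmse(pred, obs):
    """Assumed definition: RMSE divided by the mean of the observations."""
    return rmse(pred, obs) / (sum(obs) / len(obs))
```

An rRMSE of 0.3 at an RMSE of 1 mm/day would correspond to mean daily ET near 3.3 mm/day under this definition.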
1 - 40
Towards an AI Ecosystem in Bolivia
Oscar Contreras Carrasco
As a country, we are currently facing many challenges in the adoption of Artificial Intelligence at different levels. While we do not yet have a formal plan for AI implementation, there are several initiatives that intend to address key issues such as education on AI and industry adoption. Additionally, several communities and study groups are tackling AI education and spreading the word about the benefits it can provide. At an institutional level, there have also been initial discussions on tackling AI adoption nationwide with key strategies at different levels. All in all, the purpose of this presentation is to discuss these initiatives, as well as the current challenges and future plans for the adoption of Artificial Intelligence in Bolivia.
artificial intelligence, bolivia, ecosystem
1 - 41
Towards Bio-Inspired Artificial Agents
The study of biological sensory systems allows us to understand the principles of computation used to extract information from the environment, inspiring new algorithms and technologies. Inspired by retinal computation, we propose visual sensors for automatic image/video equalization (tone mapping) and autonomous robot navigation. We will also analyze the cortical circuit associated with decision making, the cortical-basal ganglia loop, in order to incorporate it into a robot controller. To this end, we propose a model including tonic dopamine type D1 receptors, which modulates the robot's behavior, in particular the balance between exploitation and exploration.
bio-inspired computation, retina, decision-making, artificial agents
1 - 42
Unsupervised domain adaptation for brain MR image segmentation through cycle-consistent adversarial networks
Julián Alberto Palladino
"Image segmentation is one of the pilar problems in the fields of computer vision and medical imaging. Segmentation of anatomical and pathological structures in magnetic resonance images (MRI) of the brain is a fundamental task for neuroimaging (e.g brain morphometric analysis or radiotherapy planning). Convolutional Neural Networks (CNN) specifically tailored for biomedical image segmentation (like U-Net or DeepMedic) have outperformed all previous techniques in this task. However, they are extremely data-dependent, and maintain a good performance only when data distribution between training and test datasets remains unchanged. When such distribution changes but we still aim at performing the same task, we incur in a domain adaptation problem (e.g. using a different MR machine or different acquisition parameters for training and test data). In this work, we developed an unsupervised domain adaptation strategy based on cycle-consistent adversarial networks. We aim at learning a mapping function to transform volumetric MR images between domains (which are characterized by different medical centers and MR machines with varying brand, model and configuration parameters). This technique allows us to reduce the Jensen-Shannon divergence between MR domains, enabling automatic segmentation with CNN models on domains where no labeled data was available."
unsupervised domain adaptation, cyclegans, biomedical image segmentation
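The cycle-consistency constraint at the heart of this approach can be illustrated with a toy scalar version. This is only a sketch of the general idea (two mappings G: A→B and F: B→A trained so that F(G(x)) ≈ x), not the poster's implementation:

```python
# Toy L1 cycle-consistency loss: how far samples drift after a
# round trip A -> B -> A through the two learned mappings.
def cycle_consistency_loss(samples, g, f):
    """Mean |F(G(x)) - x| over scalar samples; 0 means a perfect round trip."""
    return sum(abs(f(g(x)) - x) for x in samples) / len(samples)
```

In the full method this term is added to the adversarial losses of both generators, penalizing mappings that do not invert each other.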
1 - 43
Using Deep Learning to make Black Hole Weather Forecasting
One way to describe black holes and how they affect the environment around them is to use numerical simulations to solve, for example, the Navier-Stokes equations. The evolution of such systems is turbulent and has a very high computational cost. We propose using deep learning to describe the evolution of these systems, based on inputs and outputs from previous simulations. In this way, we can train convolutional neural networks to understand the system and predict its future state. In our project, we already have promising results in predicting the environment around a black hole using convolutional neural networks.
black holes, convolutional neural networks, turbulence
1 - 44
UWB Radar for dielectric characterization
Ultra-Wideband (UWB) radar signals are characterized by both a high-frequency carrier and a high bandwidth. This makes the field scattered by targets irradiated with UWB pulses highly dependent on the composition and shape of the target. In particular, we focus on recovering the permittivity of the target from the measured scattered field. We are currently working on moisture detection, classifying samples into different categories. To this end, we designed and built an impulse-radar UWB testbed that transmits a Gaussian pulse of approximately 1 ns duration and captures the target's response to this excitation. We then process the measured signals and use them as input to our classification algorithms.
ultra-wideband, classification, electromagnetic scattering
1 - 45
Visualizing the viral evolution for untangling and predicting it
Maria Ines Fariello
"Viral emergence of drug resistance can be monitored by deep sequencing over short periods of time. Due to its high mutation rate and short generation time, viruses represent a great model to study this phenomena. As it is highly probable to find several alleles of a viral population in a random position of the genome just by chance, the consensus allele will appear with high frequency and several codons at low frequency. We use Shannon's Entropy to represent codons' frequencies variability, reducing the data dimensionality significantly without losing key information related with underlying evolutionary processes. Entropy was decomposed given its rate of temporal evolution into two processes: Leading and Random Variations. Several statistical and machine learning analysis were applied to this data to clusterize sites in the genome based on their evolutionary behavior, and to differentiate among the three viral variants. Some of the outliers pinpointed by these methods were shown to be sites under selection by other authors. Altogether, we are testing new analysis tools and visualization methods for detecting relevant sites under ongoing selection in a rapid way. For example, to differentiate the evolution of viruses under a new environment, such as a new drug treatment. "
viral evolution, visualization, classification
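The per-site summary statistic used above is the standard Shannon entropy of the codon frequency distribution. A minimal sketch of that quantity (an illustration, not the authors' implementation):

```python
# Shannon entropy (in bits) of the codon frequencies observed at one
# genome site: 0 for a fixed consensus codon, higher for variable sites.
import math

def codon_entropy(freqs):
    """freqs: codon frequencies at one site, summing to 1; zero terms are skipped."""
    return -sum(f * math.log2(f) for f in freqs if f > 0)
```

Tracking this value per site over sequencing time points yields the low-dimensional trajectories that are then decomposed and clustered.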
1 - 46
BeMyVoice - Bringing down communication barriers
Diego G. Alonso
Nowadays, deaf-mute people face various communication problems, not only because of their condition but also because few people know sign language. These communication problems affect their education, employment, and social development. With BeMyVoice, we aim to give deaf-mute people a way to improve their communication and, thus, their quality of life. In short, we propose a mobile app connected to a sensor that allows the automatic recognition of hand signs and their translation into text and voice.
hand gesture recognition, deep learning, natural user interfaces, egocentric vision