5 July 2016

Basic Intel SIMD assembly notation

 
Loading
  • movupd xmm0 ... (SSE: move unaligned packed double into a 128-bit register)
  • vmovaps ymm0 ... (AVX: move aligned packed single into a 256-bit register)

Operating
  • vaddpd ymm1 ymm2 (AVX: add packed double, 256-bit)
  • addsd (SSE: add scalar double; an SSE instruction, but NOT a vector operation!)

KEY
  • v (prefix) = AVX
  • p, s (second-to-last letter) = packed, scalar
  • u, a (after mov) = unaligned, aligned
  • s, d (last letter) = single, double


Source: http://www.cac.cornell.edu/education/training/ParallelFall2012/Vectorization.pdf
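
To make the key concrete, here is a minimal sketch that decodes a mnemonic letter by letter. The function name `decodeMnemonic` is hypothetical and not part of any tool, and the pattern only covers mnemonics shaped like the four examples in this note (an optional `v`, then `mov` or `add`, then the suffix letters). Reporting "SSE" when there is no `v` prefix is my own reading of the examples above.

```javascript
// Decode a SIMD mnemonic using the KEY from this note.
// Hypothetical helper: only handles the mov/add examples shown above.
function decodeMnemonic(mnemonic) {
  const m = /^(v?)(mov|add)([ua]?)([ps])([sd])$/.exec(mnemonic.toLowerCase());
  if (m === null) {
    return null; // outside the small subset covered here
  }
  const [, v, op, align, pack, prec] = m;
  return {
    // Assumption: no "v" prefix means the SSE form, as in the examples above.
    isa: v === "v" ? "AVX" : "SSE",
    operation: op,
    // Only the move instructions carry an alignment letter.
    alignment: align === "u" ? "unaligned" : align === "a" ? "aligned" : null,
    packing: pack === "p" ? "packed" : "scalar",
    precision: prec === "s" ? "single" : "double",
  };
}

const decoded = decodeMnemonic("vmovaps");
console.log(decoded.isa, decoded.alignment, decoded.packing, decoded.precision);
```

Note how `addsd` decodes as scalar double: the first `s` is the packed/scalar slot and the `d` is the precision slot, which is why the same letter can mean two different things depending on its position.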

4 July 2016

Intel Data Alignment

  • SSE2: 16 bytes
  • AVX: 32 bytes
  • Xeon Phi: 64 bytes


Alignment increases the efficiency of data loads and stores to and from the processor. When targeting Intel® Streaming SIMD Extensions 2 (Intel® SSE2) platforms, use 16-byte alignment, which allows the use of the SSE aligned load instructions. When targeting the Intel® Advanced Vector Extensions (Intel® AVX) instruction set, try to align data on a 32-byte boundary. (See Improving Performance by Aligning Data.) For Intel® Xeon Phi™ coprocessors, memory movement is optimal on 64-byte boundaries. (See Data Alignment to Assist Vectorization.)


https://software.intel.com/en-us/articles/explicit-vector-programming-best-known-methods
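
As a quick illustration of what these boundaries mean, the arithmetic can be sketched as follows. This only works on plain byte offsets and does not allocate aligned memory. The names `BOUNDARY`, `isAligned` and `alignUp` are my own, and offsets are assumed to be non-negative integers.

```javascript
// Alignment boundaries in bytes, taken from the list above.
const BOUNDARY = { sse2: 16, avx: 32, xeonPhi: 64 };

// True when a byte offset sits exactly on the boundary.
function isAligned(offset, boundary) {
  return offset % boundary === 0;
}

// Round a byte offset up to the next multiple of the boundary.
function alignUp(offset, boundary) {
  return Math.ceil(offset / boundary) * boundary;
}

console.log(alignUp(40, BOUNDARY.sse2));      // 48
console.log(alignUp(40, BOUNDARY.avx));       // 64
console.log(isAligned(64, BOUNDARY.xeonPhi)); // true
```

A buffer that starts at offset 40 would therefore need 8 bytes of padding to be usable with SSE aligned loads, and 24 bytes for AVX or Xeon Phi.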

19 December 2013

Python 2.7 Compile

How to compile Python 2.7 with static and dynamic libraries in a custom folder, with UCS-4 (4-byte) Unicode enabled.

./configure --prefix=$HOME/usr/local --enable-shared --enable-unicode=ucs4 --with-pydebug
 
Otherwise, importing an extension module that was built for a different Unicode width can fail with the undefined-symbol error explained here:

http://docs.python.org/2.7/faq/extending.html#when-importing-module-x-why-do-i-get-undefined-symbol-pyunicodeucs2

25 October 2013

netCDF 4.3.0 with HDF4, HDF5 and parallel

The configure arguments below build netCDF with support for HDF4, HDF5 and parallel I/O. Build the libraries in this order.

Compile HDF4 - 4.2.9

./configure --enable-shared --disable-netcdf --disable-fortran --prefix=/usr/local
 
Compile HDF5  - 1.8.11

CC=mpicc ./configure --enable-parallel --prefix=/usr/local --with-zlib=/usr/include --enable-hl --enable-shared

Compile netCDF - 4.3.0 with HDF4 and HDF5 with parallel

CPPFLAGS="-I/usr/local/include -I/usr/include/hdf" CXXFLAGS="-I/usr/local/include -I/usr/include/hdf" FFLAGS="-I/usr/local/include -I/usr/include/hdf" FCFLAGS="-I/usr/local/include -I/usr/include/hdf" LDFLAGS=-L/usr/local/lib FC=mpif90 CXX=mpicxx CC=mpicc ./configure --enable-hdf4 --enable-netcdf4 --enable-shared --enable-dap

11 April 2012

Prolog Random

A short note on how to call the random function in Prolog:

A is random(23).

Link to the documentation

25 February 2012

jQuery datepicker validation


The jQuery datepicker validation may fail in WebKit-based browsers (Chrome, Safari, ...). These browsers seem to read the day as the month and the month as the day when the format is DD/MM/YYYY. To work around the problem I did the following:

Edit the file jquery.validation.js where it says:
 

date: function(value, element) {
    // Original rule: relies on the browser's own Date parsing.
    return this.optional(element) || !/Invalid|NaN/.test(new Date(value));
},

and change it to:

date: function(value, element) {
    // Accept DD/MM/YYYY or DD-MM-YYYY without calling the Date parser.
    return this.optional(element) || /^\d{2}[\/-]\d{2}[\/-]\d{4}$/.test(value);
},

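Validation now uses a regular expression, and it works correctly when tested in Google Chrome. Note that the pattern only checks the shape of the value, so an impossible date such as 31/02/2012 still passes.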
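
If the calendar date itself should also be checked, something along these lines might work. It is only a sketch and not part of the jQuery validation plugin. The function name `parseDayMonthYear` is hypothetical, and it assumes the same DD/MM/YYYY or DD-MM-YYYY input as the pattern used in the rule.

```javascript
// Parse a DD/MM/YYYY (or DD-MM-YYYY) string and confirm it is a real date.
// Returns a Date on success and null otherwise.
function parseDayMonthYear(value) {
  const m = /^(\d{2})[\/-](\d{2})[\/-](\d{4})$/.exec(value);
  if (m === null) {
    return null; // wrong shape
  }
  const day = Number(m[1]);
  const month = Number(m[2]);
  const year = Number(m[3]);
  // Build the date from numeric parts, so no browser-specific string parsing.
  const date = new Date(year, month - 1, day);
  // Date rolls out-of-range parts over (31/02 lands in March), so compare back.
  const matches =
    date.getFullYear() === year &&
    date.getMonth() === month - 1 &&
    date.getDate() === day;
  return matches ? date : null;
}

console.log(parseDayMonthYear("25/02/2012") !== null); // true
console.log(parseDayMonthYear("31/02/2012") !== null); // false
```

Inside the validation rule, the regular-expression test could then presumably be replaced by `parseDayMonthYear(value) !== null`, keeping the `this.optional(element)` check as it is.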