Wednesday, March 17, 2021

Machine Learning


Chapter 1

Introduction

Arthur Samuel (1959): Field of study that gives computers the ability to learn without being explicitly programmed.

Consider Tom Mitchell's (1998) more formal definition: a computer program is said to learn from experience (E), with respect to some task (T), and some performance measure (P), if its performance on T, as measured by P, improves with experience E.

In other words, let's consider the situation where we have an email program that watches which emails you do or do not mark as spam, and based on that learns how to better filter spam:

a) The task (T) is what I basically want to achieve: classifying emails as spam or not spam. It's the goal.

b) The performance measure (P) is the number of emails correctly classified as spam or not spam. This is the result I can measure after applying an algorithm.

c) And the experience (E) is the learning process: watching which emails get labeled as spam or not.

Machine learning has two main types of learning algorithms: supervised and unsupervised learning. Others are reinforcement learning and recommender systems.
 

Supervised Learning


In the supervised learning situation, the right answers are given. For example, I have data on house prices per square foot in a certain area, and by collecting those prices, I can predict a base price for a specific house, with a specific size, that I want to sell. The prediction might not be entirely accurate, but it gets me closer to the real value.

If I fit a curve to the data as a trend, predicting a continuous-valued output (e.g. the price in 2008 versus 2021), that defines what is called Regression. If instead I predict a discrete-valued output, that is called Classification.

A better way to understand the difference between Classification and Regression problems is through examples:

a) You have an inventory of identical items, and you want to predict how many of these items you will sell over the next 3 months. You define what the items are and the time window, you look at how many of them were sold in the past, and by using a predictive algorithm, you can get an approximate number of how many will be sold in the stipulated period of time. This is Regression.

b) In the following example, instead, I want to examine individual customer accounts, and for each account decide whether it has been hacked or compromised. I can define a discrete value: 0 for not hacked and 1 for hacked, and after applying the algorithm, I obtain the result for these two categories. This is Classification.

To remember, then: when you have a data set with the right answers given, and you predict either a discrete value (Classification) or a trend (Regression), we're talking about supervised learning.

Unsupervised Learning


The unsupervised learning case is when the right answers are not given. I have a data set, and based on that alone, I want to discover some structure or result.

A good example is news.google.com, which scans different news stories and then groups the similar ones into a single category, merging the most highlighted information into one single link.

Another one is a data set of genes from different individuals. By applying an algorithm, it is possible to group the individuals into different types, or to obtain other information, through an autonomous learning process. This can happen not only with genome data, but also when organising computer clusters, doing social network analysis, market segmentation, or astronomical data analysis.

The “cocktail party problem” expands a bit more on how, given a set of data, machine learning can help separate the voices of two different people into two different categories for you. You don't choose which voice or language, who has the deeper or weaker voice, who is male or female, or even whether it's music or a person speaking in the background: the algorithm should be able to separate all of that into categories for you automatically, rather than you defining it a priori.

Therefore, the best way to remember this is: when the classification is not defined in advance, you are using an unsupervised method.


Sunday, August 27, 2017

Installing Dropbox in Kali

Please go to the Dropbox page and download the version for Ubuntu (file extension .deb) -- https://www.dropbox.com/install-linux.

This is going to download a file in this format:
dropbox_2015.10.28_amd64.deb

Then run:
wget https://www.dropbox.com/download?dl=packages/ubuntu/dropbox_2015.10.28_amd64.deb

After that:
sudo dpkg -i <package name>

Example package name:
download?dl=packages%2Fubuntu%2Fdropbox_2015.10.28_amd64.deb

Therefore:
sudo dpkg -i download?dl=packages%2Fubuntu%2Fdropbox_2015.10.28_amd64.deb

This will install the package.
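A tidier alternative (my own variation, not part of the original steps) is to tell wget what to name the download, so the dpkg step is easier to type:

wget -O dropbox.deb "https://www.dropbox.com/download?dl=packages/ubuntu/dropbox_2015.10.28_amd64.deb"
sudo dpkg -i dropbox.deb
sudo apt-get -f install

The last command pulls in any dependencies that dpkg could not resolve on its own.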

To check the GUI version of Dropbox just installed, go to Usual Applications > Internet > and you should see Dropbox appearing there.

Tuesday, August 15, 2017

Further Carving

Not satisfied with the Autopsy results, I ran the application called foremost, and got more files.

I then ran Autopsy again, and found out that the parameters originally chosen for evidence09 were probably too restrictive, as the second time, Autopsy captured the same number of files as foremost.

The only issue is that foremost recovers all the data without showing which path each unallocated block came from; with Autopsy, by contrast, you can filter what was under Internet Explorer, for example, and ignore those files (most likely images downloaded temporarily whilst the user was browsing).
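For reference, a typical foremost invocation looks like this (the output directory name is my own choice, not necessarily the one used in the original run):

foremost -v -i evidence09_sdb.dd -o foremost_output

-i is the input image, -o is the directory where the carved files are written, and -v prints verbose progress.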

I was also curious to try another tool, so this time I'm using PhotoRec. This one requires you to mount the image first, though. Therefore I did the following:

1) Create a mount point directory called /mnt/evi09mnt
2) Run the command to mount image:
mount -o ro,loop,offset=32256 evidence09_sdb.dd /mnt/evi09mnt
3) Download the latest PhotoRec version (it doesn't require installation; you just have to download and unzip the files into a folder).
4) Run the file called ./photorec_static.

This will open the partition you want to test. You also need to select a separate folder for the results.

In my test, 22420 files were recovered. I hadn't filtered the files I wanted, so I went back and configured the following file types to be recovered:

accdb (Access Data Base, as part of Office)
bk (MS Backup File)
bmp (BMP bitmap image)
doc (Microsoft Office Document: doc/xl/ppt/vsd/... for 3ds Max, MetaStock, Wilcom ES)
evt (Windows Event Log)
gif (Graphic Interchange Format)
http (HTTP Cache)
jpg (JPG picture)
key (Synology AES key)
mov/mdat (Recover mdat atom as a separate file)
mov (mov/mp4/3gp/3g2/jp2)
mp3 (MP3 audio: MPEG ADTS, layer III, v1)
mpg (Moving Picture Experts Group video)
nsf (Lotus Notes)
one (Microsoft OneNote)
pcx (PCX bitmap image)
pdf (Portable Document Format, Adobe Illustrator)
png (Portable/JPEG/Multiple-Image network Graphics)
psb (Adobe Photoshop Image)
psd (Adobe Photoshop Image)
psf (Print Shop)
psp (Paint Shop Pro Image File)
pst (Outlook: pst/wab/dbx)
ra (Real Audio)
*rar (Rar archive)
reg (Windows Registry)
res (Microsoft Visual Studio Resource file)
riff (RIFF audio/video: wav, cdr, avi)
rm (Real Audio)
sqm (Windows Live Messenger Log File)
tar (tar archive)
tif (Tag Image File Format and some raw file formats: pef/nef/dcr/sr2/cr2)
*tx? (Text files with header: rtf/xml/xhtml/mbox/imm/pm/ram/reg/sh/slk/stp/jad/url)
*txt (Other text files: txt/html/asp/bat/C/jsp/perl/php,py/emlx... scripts)
wks (Lotus 1-2-3)
xar (xar archive)
xml (Symantec encrypted xml files)
*zip (zip archive including OpenOffice and MSOffice 2007)


Monday, August 14, 2017

Confirming Serial Number of Hard Disk Image

Mount the image, then run the command lshw.

If this is not installed, please run:

sudo apt-get install lshw
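To narrow the output down to disks and their serial numbers, a class filter helps (a usage sketch, assuming the disk is already visible to the OS):

sudo lshw -class disk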

Further information here.

Mounting Images

For mounting images, first check the image structure, using the following command:

sfdisk -l evidence09_sdb.dd

You will obtain something similar to this:


The partition starts at sector 63. Multiplying this by the sector size (512 bytes) gives the byte offset we have to use: 63 × 512 = 32256.
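The same arithmetic can be done directly in the shell, which avoids mistakes with larger start sectors:

echo $((63 * 512))

This prints 32256, the offset to pass to mount.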

Therefore, the command to run is:

mount -o ro,loop,offset=32256 evidence09_sdb.dd /mnt/evi09mnt

Note: create the folder /mnt/evi09mnt beforehand (mkdir /mnt/evi09mnt).

Further information here.


Thursday, August 10, 2017

Check hard disk activity

All of a sudden I noticed that my hard disk was having a lot of activity, so I closed all my applications.

My VMware box stopped working too. I'm concerned that someone has accessed my PC.

Therefore I'm going to monitor activity with iotop. Install it first by doing the following:

sudo apt-get install iotop
sudo iotop --only

I can see that only myself is connected, with root doing activities in the background. Coincidentally, the system is more responsive now, but I can see that there are still activities running on behalf of vmware, even though I already killed that window.

Doing a:

sudo ps -ef | grep vmware

I can see several tasks running on behalf of this application, but they are just open ports.

I restarted everything again, and it seems to be working now. I'll keep monitoring.
 

Monday, August 07, 2017

Install and Uninstall in Linux: That is the Question (part 2)

As mentioned earlier in a previous post, installing packages can be complicated.

I just faced a new challenge when trying to install WPS Office. First of all, when you access the WPS download page, you can see a generic WPS Office for Linux link there.

After clicking Download, you have different options for Alpha21 for this year and last year. The options are a .deb version and an .rpm:



Just a bit of background on this: RPM packages are precompiled and built for Red Hat-based Linux distributions, and can be installed only using yum, Zypper and other RPM-based package managers.

Since Kali Linux is based on Debian, you cannot install an RPM package directly using apt or dpkg package managers.

As you can see in the list above, the version for Debian is amd64. Doing a uname -a or uname -m, I can see my operating system is running in x86_64 mode. For further information, please check this page here, which is quite handy...

Therefore, here are the steps to download and install WPS for x86_64:

1) download the version for amd64.deb listed above
2) rename it to something shorter, like wps-office.deb
3) sudo dpkg -i wps-office.deb
4) sudo apt-get -f install (to pull in any missing dependencies)

This worked for me well.

Keyring Issue

I've just published another post where I mentioned that, when trying to access Google Chrome, I was being asked to enter my keyring password.

I don't remember changing it, so I tried the password I use to access the box itself. Unfortunately, this didn't work.

So if the password you remember doesn't work, you don't have much alternative apart from deleting the password and keys there, and re-creating them again.

For doing that, run:

sudo su
cd /home/aviola/.local
cp -r keyrings keyrings.backup
cd keyrings
rm login.keyring
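Note: on many systems the keyring files live under /home/<user>/.local/share/keyrings rather than directly under .local; if the keyrings folder is not where the commands above expect it, adjust the cd accordingly.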


As you can see, I made a backup first. This is just in case you one day remember the password you used, or if you have any issues with other applications.

Then I accessed Google Chrome again, and set it up with the password I use to access the box (which is quite complex). Once that was set up, I was able to log in to Google Chrome afterwards without any issues.

Now I'll start the other applications that I have to use for my dissertation. I'm hoping everything else will work fine, but if not, I'll definitely be writing back here!

Install and Uninstall in Linux: That is the Question (part 1)

I am still struggling, after all this time using Linux, to understand upgrades and ways to install and uninstall applications in my Kali Linux.

The issue probably is because I am using 3 operating systems at the same time: 1) Windows for Work; 2) Mac as base machine to connect to VDI at work or to use daily for research/entertainment; 3) Linux for University.

And let's face it: I haven't sat at my Linux box to do much in these last few months, at all... This now changes, and it will be my friend for a month.

In Windows, to install you either download the executable file or go to Add/Remove Programs. For uninstalling, you can either run the Uninstall executable to do all the job for you, or again go back to Add/Remove Programs for a complete cleanup. And for upgrades, you can select automatic updates, which recommend a new version when one is available.

To be honest, I don't even remember anymore if all these steps are correct, as my virtual machine at work doesn't allow me to install/uninstall applications freely, and the same goes for updating them. I'm relying on the IT guy/tools to do that remotely for me, and it is seamless.

On the Mac, instead, you can install an app by just dragging the file onto your Apps 'icon', which basically copies the file you'll execute into an Applications folder, so you don't have files distributed everywhere. For uninstalling, you just delete the file you execute, and voilà: the app is gone. Upgrades are made through the App Store.

But in Linux... arghhh! Installing and uninstalling might be simple, as you just need to keep in your head that the 'installer' is called apt-get, and then you use its functions to install and remove, depending on whether you are 'installing' or 'uninstalling' the application from your box.

Example:
  
To install an app:

apt-get install <app_name>

To uninstall an app:
apt-get remove <app_name>

However, if you want to upgrade your system, things get more complicated, especially because it is not always a very straightforward process, particularly if the application you need requires different libraries, or if you have public key issues connecting to the repositories that host those libraries, for example.

To check for updates, you need to do an apt-get update. This refreshes your local package index against the remote repositories for anything new. When I ran this command while writing this blog, I in fact got the following error:


This is because, under the /etc/apt/sources.list.d folder, I had the following files:
- google-chrome.list
- playonlinux.list

These are extra repositories that were added alongside my original sources, and every time I run apt-get update, these two are attempted as well; but it seems that the public keys used before are now nonexistent, for some reason.

I could try to sort the issue out, but first, I didn't want to lose too much time, and secondly, one is for Google Chrome, which is easy to fix if I break it. The other one, I don't even remember when I installed it in the first place. So I simply decided to delete these files from there, and after running apt-get update again, I didn't get the errors any longer.

However, when I then tried to check whether Chrome was still working, I started to get a message asking for my keyring password in order to use the application properly. I entered the password I use for accessing the box, which was the one I remembered setting up (bear in mind that it had been a while since I last connected to this box, as mentioned above), and it didn't work.

Therefore, I wrote another post explaining how to fix this issue here.

Once the package index is updated, the next step is to actually run apt-get upgrade, to install any new updates found. This time I had nothing to upgrade, but once I do, if any issues come up, I'll update this post again.
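So the usual sequence, as a minimal sketch, is:

sudo apt-get update
sudo apt-get upgrade

The first command refreshes the package index; the second downloads and installs whatever updates were found.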

You can also use the Package Manager for Linux called Synaptic. For installing that, you can run the following:
sudo apt-get install synaptic
sudo synaptic 

You can check for any updates by clicking Reload, and then, for downloading and installing them, you can use the icon that says Mark All Upgrades.

Sunday, June 04, 2017

I was having several issues when trying to install applications such as LaTeX and others on my computer.

The errors I was obtaining were related to grub-efi-amd64 and a possible library not being found:
Setting up grub-efi-amd64 (2.02~beta3-5) ...
/var/lib/dpkg/info/grub-efi-amd64.config: 1: /etc/default/grub: et#: not found
dpkg: error processing package grub-efi-amd64 (--configure):
 subprocess installed post-installation script returned error exit status 127
Errors were encountered while processing:
 grub-efi-amd64
E: Sub-process /usr/bin/dpkg returned an error code (1)
Visiting several websites, I've found the following recommendation:

sudo apt-get purge grub\*
sudo apt-get install grub-efi
sudo apt-get autoremove
sudo update-grub
 
With this, I'm completely removing anything related to grub and then reinstalling and updating it afterwards.

When trying again to download and install the TeX Live packages (according to the instructions provided here), I no longer obtained any errors, and I was able to run Texmaker with no issues:

aviola@kali:~$ sudo apt-get install texlive-full
Reading package lists... Done
Building dependency tree       
Reading state information... Done
texlive-full is already the newest version (2016.20170123-5).
0 upgraded, 0 newly installed, 0 to remove and 532 not upgraded.

aviola@kali:~$ sudo apt-get install texmaker
Reading package lists... Done
Building dependency tree       
Reading state information... Done
texmaker is already the newest version (4.5-1).
0 upgraded, 0 newly installed, 0 to remove and 532 not upgraded.
aviola@kali:~$ 

Sunday, April 30, 2017

Grub and Gnome

GRUB is the multi-boot loader and allows you to load multiple configurations or operating systems. Further general information about Kali can be read here.

GNOME is the desktop environment used by the official Ubuntu flavor and, being from the same family, Kali presents it as the official theme once installed.

For configuring this, please follow the instructions from this page.


Sunday, April 23, 2017

Autopsy

I downloaded and installed Autopsy on a Windows 7 system running as a virtual machine under VMware. Please see my earlier post on how to configure VMware under Kali Linux.

Once you have the evidence in a dd image, you can go to the menu and create a new case in Autopsy. Once the case is created, you can add the data source. As explained earlier, my VM is linked to the Kali download folder, so the dd images are recognised from my main machine.

I have 10 images that I collected and that I am analysing one by one. The expected time per analysis is about 1 day. Once I have the results, I will create a new post.

Hash Database Help

As per the Sleuthkit.org page, there are hash databases that can be used to identify known-good and known-bad files, using MD5 or SHA-1 checksum values.

The different databases are:

  • NIST NSRL
  • Ignore
  • Alert

The Ignore and Alert databases require the investigator to create them. The NSRL one, instead, already contains a reference set of files that can be found in operating systems and software distributions. Therefore I will use the NIST NSRL database.

Although this one does not need to be created, I still have to attach the downloaded database and index it before it can be used.

Following instructions from the Autopsy page1 and page2, I first downloaded the NSRL database file from the Sourceforge page. For more configuration, see here.

Once the file is downloaded, extract it. You should be able to see 2 index files plus a Word document with instructions.

After extracting the files, you can go to Autopsy (I have now updated to 4.3.0) and go to Tools > Options > Hash Databases. Select the option Import database and then select the path you used when extracting the files.

In the path, you need to select the idx file and then click Open. Under Type of database, select the Known (NSRL or other) option. This will show the NSRL database in the list. Click Apply and OK to complete.

Now go to Case, select New Case, and proceed as with any new case.

Sunday, January 22, 2017

Testing tools based on a NIST image

The National Software Reference Library (NSRL) and the National Institute of Standards and Technology (NIST) have worked together on a project to collect software from various sources and incorporate file profiles computed from this software into a Reference Data Set (RDS) of information.

The RDS can be used by law enforcement, government, and industry organisations to review files on a computer by matching file profiles in the RDS. This will help to alleviate much of the effort involved in determining which files are important as evidence on computers or file systems that have been seized as part of criminal investigations.

The RDS is a collection of digital signatures of known, traceable software applications. There are application hash values in the hash set which may be considered malicious, e.g. steganography tools and hacking scripts.

Basically, the idea is to load these data sets and check that the tools I decided to use do not change or alter the evidence. Further information about the project and the NSRL can be found here.


Computer Forensics Tools Introduction


This programme was created by NIST in order to test computer forensics tools. The main goal is to determine how well these tools perform core forensics functions such as imaging drives and extracting information from devices. Further information can be found here.

For creating images, as mentioned earlier, I decided to use the tool called dcfldd, and tests against this product are listed on page 12. It says that there are some issues with this tool when an anomaly is found on the disk. The exact statement says the following:

  • When a drive with faulty sectors was imaged (test case DA-09) the tool failed to completely acquire all readable sectors near the location of the faulty sectors. In test case DA-09, a source drive with faulty sectors was cloned to a target drive. Readable sectors that were near faulty sectors on the source drive were not acquired. The tool wrote zeros to the target drive in place of these sectors.
  • When a drive with faulty sectors was imaged (test case DA-09) the data cloned to the target drive became misaligned after faulty sectors were encountered on the source drive. For example, sector 6,160,448 on the target drive contained the contents of sector 6,160,392 from the source, sector 6,160,449 on the target contained the contents of source sector 6,160,393, and so on. The size of the offset or misalignment between the data on the source and target drives grew as more faulty sectors were encountered on the source.
The full report is no longer available on the cyberfetch.org website.

For the deleted file recovery, I decided to use The Sleuth Kit (TSK) / Autopsy. The report for this product is listed on page 216, and it says that under certain circumstances, information cannot be recovered successfully from the image. The causes for this might be:

  • The data are no longer present in the image, e.g., overwritten.
  • Sufficient meta-data to locate the data is not present or reachable.
  • The algorithm implemented by the tool does not use the meta-data that locates the missing data.
  • The implementation of the tool algorithm is incorrect.

This report is also no longer available on the cyberfetch.org website.

Wednesday, April 27, 2016

VMWare in Kali

For running VMware in Kali, please follow the most recent instructions on the VMware website.

After installing VMware, the most recent issue I faced was the installation of VMware Tools and getting local folders shared with the virtual machine.

This is extremely helpful mainly when downloading documents: when doing that from inside the virtual machine, the internet connection can be very slow because the network card is on a loopback.

So I downloaded Autopsy 4.0.0 on my local machine and then shared the Downloads folder with the virtual machine, and it worked like a charm.

The other reason why I wanted to have Share Folders enabled is because I can do the analysis from Autopsy directly reading files from the /mnt folder.


But the main issue I was facing was that Kali requires root permissions for running everything; therefore, for enabling Shared Folders, you need to open the application as root as well.

In order to do that, please run the following commands:

sudo vmware (the one being used for the dissertation)
or
sudo vmplayer

Checking version

For checking the version, it is possible to run two commands:
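Most likely these are the version flags of the two binaries mentioned above (this is my assumption, not confirmed by the original post):

vmware -v
vmplayer -v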


Monday, April 25, 2016

Packaging tool

Kali does not come by default with the graphical user interface to manage package repositories.

This would allow you to install, remove, upgrade and downgrade single and multiple packages.

Therefore, when a .deb or .rpm is downloaded, double-clicking on the file does not automatically install the package.

In order to get this sorted, it is necessary to install the package manager called Synaptic, using the following command:

sudo apt-get install synaptic


Then:

sudo synaptic



Creating an image

To check whether the attached disk has been recognised by the OS, I run fdisk -l:


It is possible to see the disk attached under the name /dev/sdb. By running ls -lha /dev/sdb* I can see all the partitions the same HDD has:


Then I set read-only permissions on the disk, so there is no possibility of causing any damage to the evidence:
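A typical way to mark a block device read-only (an assumption on my part, not necessarily the exact command used here) is:

sudo blockdev --setro /dev/sdb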


Then I executed the command dcfldd:
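A sketch of a typical dcfldd invocation, reusing the evidence09 naming from the posts above, might be:

sudo dcfldd if=/dev/sdb of=evidence09_sdb.dd hash=md5 hashlog=evidence09_sdb.md5

if= is the source device, of= is the image file, and hash/hashlog compute and record an MD5 of the data while it is being imaged.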


The results are:



Computers

Computer 01



Computer 02



Computer 03



Computer 04



Computer 05



Computer 06


Computer 07



Computer 08


Computer 09




Computer 10




Other details about auctions can be found here:
Computer's auction

Accessing to Files on Android device

When trying to access an Android device, I use the method of mounting it via Files. However, when trying to access the symbolic link called /storage/emulated/0/DCIM/Camera, it does not appear there.

I went through several articles about this, and it is much simpler than what everyone is saying in the forums.

By default, Samsung devices use MTP, or Media Transfer Protocol, instead of USB Mass Storage like other USB drives. When running the command lsusb, you will be able to see the device attached to the computer:



However, the command mount does not display the device connected as a USB drive (normally shown under /dev/sdXN); instead, it shows as a gvfsd-fuse mount under /run/user/1000/gvfs:

 
Note: you need to click on the device in Files to mount it, or mount it manually via the command line, for the next steps... And on the device itself you need to acknowledge the USB connection using MTP.

The device should be mounted in the following way:
mtp://[usb:001,021]/.

Once the device is mounted, under the folder /run/user/1000/gvfs it's possible to see the host attached to it:


Under this, I can then see the same folder called Phone that appears in Files.

However, when accessing the folder Phone and then running ls -l, I still see the same folders that are shown in Files, and the physical folder behind the symbolic link /storage/emulated/0/DCIM/Camera is still not there.

Therefore I ran ls -la, and this in fact shows the folder called DCIM, in which the folder Camera can be found:
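Putting the steps together, a minimal sketch would be the following (the gvfs directory name varies per device, hence the wildcard):

cd /run/user/1000/gvfs/mtp*/Phone
ls -la DCIM/Camera

The DCIM folder does not show up with a plain ls -l here, but it becomes visible once listed with -a.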


For safe removal of the attached device from the computer, unmount and eject it from Files, or use the command line.





Sunday, April 24, 2016

Logical volumes and mounting partitions

When attaching a storage device (flash drive, external hard disk, etc.) to a computer/operating system (OS), it is visible immediately. It is possible to check this physical presence by running fdisk -l.

However, attaching the device to the computer does not mean that I will be able to access it straight away. This is because the OS is not yet aware of the filesystem (or directory tree) being used on the attached device, so the OS does not know how to see the information there, or how to read/write it.

Therefore, it is necessary to mount this physical device, creating a logical access point using the specific filesystem of the device. As soon as this happens, the OS can immediately read and write information there.

In other words
Physical device -> Set Filesystem type -> Mount Logical access -> OS can read it!


For example, when attaching a USB drive to my computer and running fdisk -l, I can see that the physical device was detected and is physically available, as shown in the list below:

 
I will also be able to see the drive listed in Files as STORE N GO (the name of the device):

I can also see precisely all the physical associations I have in the system by running ls -l /dev/disk/by-id:


If I only want to list my USB devices, I can run the command lsusb:

 


Now, if I try to access this (normally by simply doing a cd /media/aviola/STORE\ N\ GO), the drive will appear as nonexistent:


In order to make it accessible, I need to mount the device. I can simply right-click on STORE N GO in Files, and then click the Mount option.

If I then run the command mount in the Terminal, I will be able to see all the storage devices and their logical associations. I can see that /dev/sdb1 is now listed as /media/aviola/STORE N GO.

For unmounting this drive, I can type umount /media/aviola/S and press Tab twice to get the name completed (there are spaces in the name, so each space needs to be escaped with a backslash).

The easier way to mount a storage device is by using Files, but what if I want to have more control over what I am doing, and use the command line for that?

I can use the command mount and some parameters to make the logical access to this device available.

Initially I have to create a mount point. Before, it was called STORE N GO under /media/aviola. I will now create something else in /media/external with the command:
sudo mkdir /media/external

To attach the physical device to this mount point, I can use:
mount /dev/sdb1 /media/external

I did not pass any parameter specifying whether this device has a vfat or ntfs filesystem; I let the operating system decide that for me. By running mount, I can see that the device was mounted correctly.
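If I wanted to be explicit instead, the filesystem type can be passed with -t (vfat here is just an example for this flash drive):

mount -t vfat /dev/sdb1 /media/external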


Please read the article called "Accessing to Files on Android device" for further information on getting access to a USB connection using MTP.