Monday 7 November 2022

Indian law won't cover you being photographed secretly, but there is still some hope (IPC / law and order)

Ref. to the IPC (Indian Penal Code):

  1. https://www.indiacode.nic.in/  
  2. https://indiankanoon.org/

News Articles: 

https://www.edexlive.com/news/2018/jan/08/indian-law-wont-cover-you-being-photographed-secretly-but-there-is-still-some-hope-1864.html

Ref. Articles 19 and 21: The law makes it clear you are free to take pictures of other people in public areas for private use under Article 19 of the Constitution of India; however, publishing a photo in a manner that might be "embarrassing, mentally traumatic" or cause "a sense of insecurity about the activities the person in the photograph is involved in" is illegal under Article 21.

Section 509 in IPC: https://indiankanoon.org/doc/68146/

Sections 354A, 354B, 354C and 354D

Defamation case: the act of imputing anything about a person with intent to tarnish their dignity, for example by hacking their mail account and sending e-mails in vulgar language to unknown persons.

Article 19 in The Constitution Of India 1949

19. Protection of certain rights regarding freedom of speech etc
(1) All citizens shall have the right
(a) to freedom of speech and expression;
(b) to assemble peaceably and without arms;
(c) to form associations or unions;
(d) to move freely throughout the territory of India;
(e) to reside and settle in any part of the territory of India; and
(f) omitted
(g) to practise any profession, or to carry on any occupation, trade or business

Friday 16 September 2022

Newsletter of Zenith Gavels Club - Dazzle 2022


NEWSLETTER OF ZENITH GAVELS CLUB | DAZZLE 2022 | VOL 1 ISSUE 1



Prepared, Edited and Published by Venus Rinith - Newsletter Editor and VPPR 2021-2022


Newsletter by Venus Rinith

QR code to read this newsletter

Thursday 19 May 2022

Cannot open ECP via On-Prem Exchange


Situation:

We had migrated all our employees to Exchange Online and decommissioned the on-prem Exchange servers that hosted the user databases. We retained one on-prem Exchange Server 2013 CU22 just for SMTP application relay. We also created a new on-prem database with 2 user mailboxes. However, with those users we were not able to log in to the on-prem ECP.

There were no issues logging in to the Exchange Online admin center.

What errors do you see? 

:-( Something went wrong We can't get that information right now. Please try again later 


What's the environment and are there recent changes? 

Exchange Server 2013 CU22 on Windows Server 2012 R2.

Our emails (...domain) have been migrated to Exchange Online.

We are using on-prem Exchange so that the application servers hosted on Azure can relay mail through it. We noticed that the on-prem ECP wasn't accessible. There is a single database and two mailbox accounts on the on-prem Exchange server.

What have you tried to troubleshoot this? 

We verified that the on-prem database and the 2 user accounts are available on the on-prem Exchange server, but we could not log in via https://localhost/ecp.

Resolution:

ECP could not be accessed because https://localhost/ecp was being redirected to Office 365.

We checked the HTTP Redirect setting on the Default Frontend; no redirect was configured.

We checked the HTTP Redirect setting on the ECP virtual directory; no redirect was configured.

We found an HTTP Redirect configured on the OWA virtual directory, redirecting to the Office 365 portal.

We unchecked that setting and were able to access ECP successfully. (A scripted way to inspect these redirect settings is sketched below.)
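
For reference, the same HTTP Redirect settings can be inspected from a script instead of the IIS Manager GUI. The sketch below is a minimal illustration in Python that shells out to IIS's appcmd.exe; the site and virtual directory names are assumptions and may differ in your environment.

import subprocess

# Sketch: query the IIS "httpRedirect" section for the OWA and ECP virtual
# directories. Site/vdir names below are assumptions; adjust to your setup.
APPCMD = r"C:\Windows\System32\inetsrv\appcmd.exe"

for vdir in ("Default Web Site/owa", "Default Web Site/ecp"):
    result = subprocess.run(
        [APPCMD, "list", "config", vdir, "/section:httpRedirect"],
        capture_output=True, text=True,
    )
    print(f"--- {vdir} ---")
    print(result.stdout)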

Password Policies and Recommendations


As per the Microsoft security baseline recommendation, it is good practice to tighten the account lockout threshold from 15 invalid logon attempts to 10 invalid logon attempts.

The minimum password length is 8 characters as per Microsoft guidelines (Password policy recommendations - Microsoft 365 admin | Microsoft Docs).

The Windows security baseline recommends configuring a threshold of 10 invalid sign-in attempts.

Account lockout threshold (Windows 10) - Windows security | Microsoft Docs

 

Check Group Policies applied to a User Account and Computer

 

Resultant Set of Policy

There is a built-in tool called “Resultant Set of Policy” (RSoP) that simulates the policy settings applied to computers and users using Group Policy. It acts as a query engine that polls existing policies based on site, domain, domain controller, and organizational unit, and then reports the results of those queries.

To launch Resultant Set of Policy, press Win + R to fire up the Run dialog box, type rsop.msc, and press Enter.

The tool launches, scans the active policies, and displays them. You will still need to go through the folders to find each active policy applied to the account and computer.

GPResult

Alternatively, there is a command-line tool called GPResult that you can use to collect active Group Policy settings. Open a Command Prompt and run the following command.

gpresult /scope user /v

This shows all the active policies applied to the current user. To find all policies applied to the PC, run the following instead in an elevated Command Prompt window.

gpresult /scope computer /v

You can also use GPResult to gather Group Policy information for a certain user account from a remote computer (specified with the /s switch), such as below:

gpresult /s computername /u username /p password /user targetusername /scope user /r

Or, all Group Policies applied to a remote computer:

gpresult /s computername /u username /p password /scope computer /r

Note that the switch /r is to display RSoP summary data while /v is to display verbose policy information.
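
If you need to gather these reports from several machines at once, gpresult can be driven from a script. Below is a minimal sketch in Python (the computer names are hypothetical placeholders); it uses gpresult's /h switch to save an HTML RSoP report per machine.

import subprocess

computers = ["PC01", "PC02"]  # hypothetical names; replace with your machines

for pc in computers:
    # /s targets a remote system, /h writes an HTML RSoP report, /f overwrites
    subprocess.run(
        ["gpresult", "/s", pc, "/scope", "computer", "/h", f"{pc}-rsop.html", "/f"],
        check=True,
    )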


Saturday 16 October 2021

Supervised Learning - Classification/ Quiz - Decision Tree

Q No: 1

What is the final objective of Decision Tree?

  1. Maximise the Gini Index of the leaf nodes
  2. Minimise the homogeneity of the leaf nodes
  3. Maximise the heterogeneity of the leaf nodes
  4. Minimise the impurity of the leaf nodes

Ans: Minimise the impurity of the leaf nodes

In a decision tree, after every split we hope to have less 'impurity' in the resulting nodes, so that eventually we end up with leaf nodes that have the least impurity/entropy.
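
To make the measures concrete, here is a small sketch (assuming binary classification) that computes Gini impurity and entropy from a node's class proportions:

import numpy as np

def gini(p):
    # Gini impurity of a node given its class proportions (p sums to 1)
    p = np.asarray(p)
    return 1.0 - np.sum(p ** 2)

def entropy(p):
    # Shannon entropy in bits; 0 for a perfectly pure node
    p = np.asarray(p)
    p = p[p > 0]  # skip zero proportions to avoid log(0)
    return -np.sum(p * np.log2(p))

print(gini([1.0, 0.0]))     # 0.0 -> perfectly pure node
print(gini([0.5, 0.5]))     # 0.5 -> perfectly impure node (binary case)
print(entropy([0.5, 0.5]))  # 1.0 bit -> maximum entropy for two classes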


Q No: 2

Decision Trees can be used to predict

  1. Continuous Target Variables
  2. Categorical Target Variables
  3. Random Variables
  4. Both Continuous and Categorical Target Variables

Ans: Both Continuous and Categorical Target Variables


Q No: 3

When we create a Decision Tree, how is the best split determined at each node?

  1. We split the data using the first independent variable and so on.
  2. The first split is determined randomly and from then on we start choosing the best split.
  3. We make at most 5 splits on the data using only one independent variable and choose the split that gives the highest Gini gain.
  4. We make all possible splits on the data using the independent variables and choose the split that gives the highest Gini gain.

Ans: We make all possible splits on the data using the independent variables and choose the split that gives the highest Gini gain.


Q No: 4

Which of the following is not true about Decision Trees

  1. Decision Trees tend to overfit the test data
  2. Decision Trees can be pruned to reduce overfitting
  3. Decision Trees would grow to maximum possible depth to achieve 100% purity in the leaf nodes, this generally leads to overfitting.
  4. Decision Trees can capture complex patterns in the data.

Ans: Decision Trees tend to overfit the test data

(This statement is not true: decision trees tend to overfit the training data, not the test data. The other three statements are true.)


Q No: 5

If we increase the value of the hyperparameter min_samples_leaf from the default value, we would end up getting a ______________ tree than the tree with the default value.

  1. smaller
  2. bigger

Ans: smaller

min_samples_leaf = the minimum number of samples required at a leaf node

As the number of observations required in each leaf node increases, the size of the tree decreases (see the sketch below).
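
A quick way to see this effect (a sketch assuming scikit-learn; the built-in iris data stands in for any classification dataset):

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# The default min_samples_leaf=1 lets the tree grow deep; raising it forces
# earlier stops, i.e. a smaller tree (this is a form of pre-pruning).
for leaf in (1, 10, 50):
    tree = DecisionTreeClassifier(min_samples_leaf=leaf, random_state=0).fit(X, y)
    print(f"min_samples_leaf={leaf}: depth={tree.get_depth()}, leaves={tree.get_n_leaves()}")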


Q No: 6

Which of the following is a perfectly impure node?

(Figure: decision tree diagram showing Node 0, Node 1 and Node 2 with their class distributions; image not preserved. Node 1 has a 50/50 class split.)

  1. Node - 0
  2. Node - 1
  3. Node - 2
  4. None of these

Ans: Node - 1

Gini = 0.5 at Node 1

gini = 0 -> Perfectly Pure

gini = 0.5 -> Perfectly Impure


Q No: 7

In a classification setting, if we do not limit the size of the decision tree it will only stop when all the leaves are:

  1. All leaves are at the same depth
  2. of the same size
  3. homogenous
  4. heterogenous

Ans: homogenous

The tree will stop splitting after the impurity in every leaf is zero
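
This is easy to verify with a sketch (assuming scikit-learn and its iris data): an unrestricted tree keeps splitting until every leaf is homogeneous, so it scores perfectly on the data it was trained on.

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# No max_depth / min_samples_leaf limits: the tree grows until leaves are pure,
# so training accuracy is (typically) a perfect 1.0
tree = DecisionTreeClassifier(random_state=0).fit(X, y)
print(tree.score(X, y))  # 1.0 -> every leaf is homogeneous on the training data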


Q No: 8

Which of the following explains pre-pruning?

  1. Before pruning a decision tree, we need to create the tree. This process of creating the tree before pruning is known as pre-pruning.
  2. Starting with a full-grown tree and creating trees that are sequentially smaller is known as pre-pruning
  3. We stop the decision tree from growing to its full length by bounding the hyperparameters; this is known as pre-pruning.
  4. Building a decision tree on default hyperparameter values is known as pre-pruning.

Ans: We stop the decision tree from growing to its full length by bounding the hyperparameters; this is known as pre-pruning.


Q No: 9

Which of the following is the same across Classification and Regression Decision Trees?

  1. Type of predicted variable
  2. Impurity Measure/ Splitting Criteria
  3. max_depth parameter

Ans: max_depth parameter


Q No: 10

Select the correct order in which a decision tree is built:

  1. Calculate the Gini impurity after each split
  2. Decide the best split based on the lowest Gini impurity
  3. Repeat the complete process until the stopping criterion is reached or the tree has achieved homogeneity in leaves.
  4. Select an attribute of data and make all possible splits in data
  5. Repeat the steps for every attribute present in the data

  • 4,1,3,2,5
  • 4,1,5,2,3
  • 4,1,3,2,5
  • 4,1,5,3,2

Ans: 4,1,5,2,3
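
The answer describes a greedy search: for every attribute, make all possible splits, score each split, keep the best, and repeat on the child nodes. A minimal sketch of one round of that search (assuming NumPy, a numeric feature matrix X and binary labels y; not the scikit-learn implementation) might look like:

import numpy as np

def gini(y):
    # Gini impurity of a set of binary labels
    if len(y) == 0:
        return 0.0
    p = np.mean(y)
    return 1.0 - p ** 2 - (1.0 - p) ** 2

def best_split(X, y):
    # Steps 4 and 5: for every attribute, make all possible splits.
    # Steps 1 and 2: score each split by weighted Gini impurity and keep the
    # lowest (lowest impurity = highest Gini gain). Step 3 would then recurse
    # on each child node until the leaves are homogeneous.
    best = (None, None, np.inf)  # (feature index, threshold, weighted impurity)
    n = len(y)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            left, right = y[X[:, j] <= t], y[X[:, j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (j, t, score)
    return best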

Friday 15 October 2021

EDA : Lower Back Pain

 Exploratory Data Analysis on Lower Back Pain


Lower Back Pain

Lower back pain, also called lumbago, is not a disorder. It’s a symptom of several different types of medical problems. It usually results from a problem with one or more parts of the lower back, such as:

  • ligaments
  • muscles
  • nerves
  • the bony structures that make up the spine, called vertebral bodies or vertebrae

It can also be due to a problem with nearby organs, such as the kidneys.

According to the American Association of Neurological Surgeons, 75 to 85 percent of Americans will experience back pain in their lifetime. Of those, 50 percent will have more than one episode within a year. In 90 percent of all cases, the pain gets better without surgery. Talk to your doctor if you’re experiencing back pain.

In this Exploratory Data Analysis (EDA) I am going to use the Lower Back Pain Symptoms Dataset and try to find interesting insights in this dataset.


#pip install xgboost

Run the pip install above first if importing xgboost throws an error such as:

ModuleNotFoundError Traceback (most recent call last)


import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

os.getcwd()

# change working directory
os.chdir('C:\\Users\\kt.rinith\\Google Drive\\Training\\PGP-DSBA\\Jupiter Files')

dataset = pd.read_csv("backpain.csv")
dataset.head() # this will return the top 5 rows

# This command will remove the last column from our dataset.
#del dataset["Unnamed: 13"]
dataset.describe()

dataset.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 310 entries, 0 to 309
Data columns (total 13 columns):
 #   Column                    Non-Null Count  Dtype  
---  ------                    --------------  -----  
 0   pelvic_incidence          310 non-null    float64
 1   pelvic tilt               310 non-null    float64
 2   lumbar_lordosis_angle     310 non-null    float64
 3   sacral_slope              310 non-null    float64
 4   pelvic_radius             310 non-null    float64
 5   degree_spondylolisthesis  310 non-null    float64
 6   pelvic_slope              310 non-null    float64
 7   Direct_tilt               310 non-null    float64
 8   thoracic_slope            310 non-null    float64
 9   cervical_tilt             310 non-null    float64
 10  sacrum_angle              310 non-null    float64
 11  scoliosis_slope           310 non-null    float64
 12  Status                    310 non-null    object 
dtypes: float64(12), object(1)
memory usage: 31.6+ KB

dataset["Status"].value_counts().sort_index().plot.bar()


dataset.corr()




plt.subplots(figsize=(12,8))
sns.heatmap(dataset.corr())




sns.pairplot(dataset, hue="Status")



Visualize Features with Histogram: A Histogram is the most commonly used graph to show frequency distributions.
dataset.hist(figsize=(15,12),bins = 20, color="#007959AA")
plt.title("Features Distribution")
plt.show()




Detecting and Removing Outliers

plt.subplots(figsize=(15,6))
dataset.boxplot(patch_artist=True, sym="k.")
plt.xticks(rotation=90)


Remove Outliers:
# we use the Tukey method to remove outliers:
# whiskers are set at 1.5 times the interquartile range (IQR)
def remove_outlier(feature):
    first_q = np.percentile(X[feature], 25)
    third_q = np.percentile(X[feature], 75)
    IQR = third_q - first_q
    IQR *= 1.5
    minimum = first_q - IQR  # the acceptable minimum value
    maximum = third_q + IQR  # the acceptable maximum value

    mean = X[feature].mean()
    # any value beyond the acceptable range is considered an outlier;
    # we replace outliers with the mean value of that feature
    X.loc[X[feature] < minimum, feature] = mean
    X.loc[X[feature] > maximum, feature] = mean

# taking all the columns except the last one
# (the last column is the label)
X = dataset.iloc[:, :-1]
for i in range(len(X.columns)):
    remove_outlier(X.columns[i])


Feature Scaling:

Feature scaling, for example through standardization (Z-score normalization) or min-max scaling, can be an important preprocessing step for many machine learning algorithms. Our dataset contains features that vary highly in magnitude, units and range, and since many machine learning algorithms use the Euclidean distance between data points in their computations, this creates a problem. To avoid this effect, we need to bring all features to the same level of magnitude. This can be achieved by scaling the features; here we use scikit-learn's MinMaxScaler, which rescales every feature to the [0, 1] range:

from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
scaled_data = scaler.fit_transform(X)
scaled_df = pd.DataFrame(data = scaled_data, columns = X.columns)
scaled_df.head()

Label Encoding:

Certain algorithms like XGBoost can only have numerical values as their predictor variables. Hence we need to encode our categorical values. LabelEncoder from sklearn.preprocessing package encodes labels with values between 0 and n_classes-1.

label = dataset["class"]

encoder = LabelEncoder()

label = encoder.fit_transform(label)

Model Training and Evaluation:


from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from xgboost import XGBClassifier

X = scaled_df
y = label
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.15, random_state=0)

clf_gnb = GaussianNB()
pred_gnb = clf_gnb.fit(X_train, y_train).predict(X_test)
accuracy_score(pred_gnb, y_test)
# Out []: 0.8085106382978723

clf_svc = SVC(kernel="linear")
pred_svc = clf_svc.fit(X_train, y_train).predict(X_test)
accuracy_score(pred_svc, y_test)
# Out []: 0.7872340425531915

clf_xgb = XGBClassifier()
pred_xgb = clf_xgb.fit(X_train, y_train).predict(X_test)
accuracy_score(pred_xgb, y_test)
# Out []: 0.8297872340425532

Feature Importance:

from xgboost import plot_importance

fig, ax = plt.subplots(figsize=(12, 6))
plot_importance(clf_xgb, ax=ax)

Marginal plot

A marginal plot allows us to study the relationship between 2 numeric variables. The central chart displays their correlation.

Let's visualize the relationship between degree_spondylolisthesis and the encoded Status label:

sns.set(style="white", color_codes=True)

sns.jointplot(x=X["degree_spondylolisthesis"], y=label, kind='kde', color="skyblue")